VMware Workstation lab: Nested ESXi and vCenter Server

In a previous post (https://enterpriseadmins.org/blog/virtualization/nested-vmware-workstation-lab-using-linked-clones/) I mentioned a nested ‘smash lab’ using VMware Workstation. This post will focus on a couple of the component VMs: lab-esxi-02 and lab-esxi-03, which are nested ESXi 7.0.3 and ESXi 8.0.0 hosts, each containing a corresponding vCenter Server Appliance.

These two nested ESXi hosts differ only in the version of ESXi that is installed. Each has 2 vCPU, 20GB RAM, a 16GB SCSI 0:0 device (for the ESXi install) and a 100GB SCSI 1:0 device (for a VMFS datastore). I decided to install ESXi manually so that I could choose specific builds. I installed the build one patch behind the latest available, so that an upgrade would be available if I ever needed to test one. Other than the default next > next > finish installs, I only made 2 changes to these hosts:

  1. Configured networking from the DCUI. This involved setting the IP address to 172.16.10.2 or 172.16.10.3, where the last octet matches the host name, as well as setting the default gateway and DNS server IP to 172.16.10.1, which is the lab side of the domain controller.
  2. Created a VMFS datastore named local-hdd on the 100GB SCSI 1:0 device. I could have automated this, but since it was a super simple task I decided to just knock it out in the UI.

Once ESXi was installed, I deployed the corresponding vCenter Server Appliance to each host’s local datastore. For this I first created DNS records for the appliances with associated IP addresses, created a copy of the <cd-rom>\vcsa-cli-installer\templates\install\embedded_vCSA_on_ESXi.json file, specified values for hostname, datastore name, etc. and then deployed through the command line with .\vcsa-deploy install C:\tmp\lab-vcsa-13.json --accept-eula --acknowledge-ceip --no-ssl-certificate-verification and waited until the process completed.

I ran into two challenges with this. First, running a nested 64-bit guest requires that “Virtualize Intel VT-x/EPT or AMD-V/RVI” be selected in the Workstation VM’s processor configuration. Credential Guard was enabled on my system and had to be disabled before the VCSA would start.

The second challenge was that ntp_servers defaults to time.nist.gov in the JSON configuration file. I didn’t change this, and deployed the VCSA while my laptop could not reach the internet. The VCSA startup failed, and the log files showed every entry stamped 1970-01-01. I remembered that NTP was set to an internet address, so I deployed again after updating the JSON file to point to time.example.org, a CNAME for the lab DNS server lab-mgmt-01.example.org, and this worked without error.
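The JSON edits described above can also be scripted. The following is a minimal sketch that uses an abbreviated stand-in for the template rather than the real file; the key names (new_vcsa, esxi, os, ntp_servers) are my recollection of the installer template layout and should be checked against your copy, and the host FQDN is an assumed value.

```powershell
# Abbreviated stand-in for the copied template; the real file has more sections.
# Key names are assumed to match the installer template and should be verified.
$template = @'
{
  "new_vcsa": {
    "esxi": { "hostname": "<ESXi host>", "datastore": "<datastore>" },
    "os":   { "ntp_servers": "time.nist.gov", "ssh_enable": false }
  }
}
'@ | ConvertFrom-Json

# Point the deployment at the nested host, its datastore and the lab NTP name.
$template.new_vcsa.esxi.hostname  = 'lab-esxi-02.example.org'   # assumed FQDN
$template.new_vcsa.esxi.datastore = 'local-hdd'
$template.new_vcsa.os.ntp_servers = 'time.example.org'

# Serialize again; -Depth keeps nested sections from being truncated.
$json = $template | ConvertTo-Json -Depth 10
$json

# To save the result for vcsa-deploy:
# $json | Set-Content -Path C:\tmp\lab-vcsa-13.json
```

Against the real template, you would load it with Get-Content -Raw instead of the here-string and leave every other key untouched.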

Once the VCSA was running, I debated on whether or not I should create inventory (like a new Data Center object, containing a Cluster with my nested ESXi host, etc) but decided to leave the VCSA completely unconfigured. This will allow me to address the configuration each time the environment is deployed. In the past I’ve created a minimal config, so time will tell which route is best. If having a minimal configuration is more practical, I can address that issue and create a new snapshot if needed.

With my ESXi host & VCSA deployed, I powered down the VCSA and ESXi host. Once the Workstation ESXi VM was powered off, I created a new snapshot so this could be used as a parent virtual machine for future linked clones.

Posted in Lab Infrastructure, Virtualization

VMware Workstation lab: Management Console

In a previous post (https://enterpriseadmins.org/blog/virtualization/nested-vmware-workstation-lab-using-linked-clones/) I mentioned a nested ‘smash lab’ using VMware Workstation. This post will focus on one of the component VMs: lab-mgmt-01, which functions as the management console/GUI, domain controller, DNS Server, Certificate Authority, and NAT gateway.

This VM is a Windows Server 2022 Standard virtual machine with a very generic install. A lot of services will end up running on this VM and it’s likely that it will be used by nearly every test. Therefore, I decided to document the configuration in PowerShell, in case I ever want to redeploy it on a newer version of Windows.

This first section of code sets the IP address of the lab-facing interface and renames the computer:

# Set IP address for interface #2; see previous post for network diagram.
New-NetIPAddress -InterfaceAlias 'Ethernet1' -IPAddress 172.16.10.1 -PrefixLength 24 -Confirm:$false

# Set the computer name
Rename-Computer -NewName 'lab-mgmt-01' -Restart:$true

### We need a reboot after a name change before a dcpromo; this should happen automatically as part of the above Rename-Computer step ###

The second set of code focuses on installing the Active Directory components and promoting this system to a domain controller for a new forest named example.org. I like using example domain names as these are specifically reserved by RFC 2606 – Reserved Top Level DNS Names (ietf.org) and make documentation/screenshots look nice.

# Install AD and DNS roles
Install-WindowsFeature -name AD-Domain-Services,DNS -IncludeManagementTools

# Make me a new AD Forest
Import-Module ADDSDeployment
Install-ADDSForest `
-CreateDnsDelegation:$false `
-DatabasePath "C:\Windows\NTDS" `
-DomainMode "WinThreshold" `
-DomainName "example.org" `
-DomainNetbiosName "EXAMPLE" `
-ForestMode "WinThreshold" `
-InstallDns:$true `
-LogPath "C:\Windows\NTDS" `
-NoRebootOnCompletion:$false `
-SysvolPath "C:\Windows\SYSVOL" `
-Force:$true

After the system is promoted to a domain controller it will automatically reboot. When the system comes back up there are a few more services we need to install like the Certificate Authority and Routing components.

### PART 2, AFTER AD REBOOT ###
Install-WindowsFeature Routing,Adcs-Cert-Authority,Adcs-Web-Enrollment -IncludeManagementTools


# Configure RRAS
Install-RemoteAccess -VpnType RoutingOnly
$ExternalInterface="Ethernet0"
$InternalInterface="Ethernet1"

cmd.exe /c "netsh routing ip nat install"
cmd.exe /c "netsh routing ip nat add interface $ExternalInterface"
cmd.exe /c "netsh routing ip nat set interface $ExternalInterface mode=full"
cmd.exe /c "netsh routing ip nat add interface $InternalInterface"

# Configure Certificate Authority
Install-AdcsCertificationAuthority -CAType EnterpriseRootCA -CACommonName rootca.example.org -ValidityPeriod:Years -ValidityPeriodUnits 10 -Confirm:$false
Install-AdcsWebEnrollment -Confirm:$false

There were a few more steps that I completed manually.

  1. In the Certification Authority console > right click the CA rootca.example.org > Security tab > select the Administrators group > check the box for Request Certificates. This will allow the default admin user to be able to request certificates.
  2. Create a DNS record for time.example.org to be a CNAME back to the domain controller. This allows the domain controller to provide time to the ESXi hosts & VCSA and allows things to work as expected even when disconnected from the internet.
    Note: any DNS edits will need to happen in this parent VM to be available for other lab exercises. In addition to this time record that was initially created, it might be helpful to create extra records that might point at the container host, for services like SMTP.
  3. Configured DNS to disable root hints and set forwarder to home network DNS Server (could have pointed to Google or CloudFlare).
  4. Installed PowerCLI module and configured some common settings: Install-Module vmware.powercli; Set-PowerCLIConfiguration -InvalidCertificateAction:Prompt -ParticipateInCeip:$true -Scope:AllUsers
  5. Ran VMware Horizon OS Optimization Tool to disable services like screensaver and Windows Update.
  6. Configured autologin in Workstation under VM > Settings > Options > Autologin so that we automatically log in as the domain’s Administrator account.
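The DNS changes in steps 2 and 3 could also be made from PowerShell instead of the DNS console. This is a sketch under a few assumptions: it uses the DnsServer module cmdlets that are installed with the DNS role, the forwarder address 192.0.2.53 is a documentation placeholder for the home network DNS server, and -UseRootHint $false is my reading of “disable root hints”.

```powershell
# Run on lab-mgmt-01 after the domain controller promotion has completed.

# Step 2: time.example.org becomes a CNAME for the domain controller.
Add-DnsServerResourceRecordCName -ZoneName 'example.org' -Name 'time' -HostNameAlias 'lab-mgmt-01.example.org'

# Step 3: forward unresolved queries and do not fall back to root hints.
# 192.0.2.53 is a placeholder; substitute the home network DNS server.
Set-DnsServerForwarder -IPAddress '192.0.2.53' -UseRootHint $false
```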

I rebooted the VM a couple of times to make sure that Autologin worked, that services would start up, and that everything was working as expected. I finally powered down the VM and created a new snapshot so this could be used as a parent virtual machine for future linked clones.

Posted in Lab Infrastructure, Virtualization

VMware Workstation Lab Overview – with Linked Clones

I like to have easy access to a variety of lab environments. I keep a fairly active home lab which has a focus on persistent virtual machines — like running copies of various vCenter Server releases, vRealize Suite, Horizon, etc. I like to consider these sort of ‘production’ as when they break I will typically troubleshoot and repair them in place. However, I also like to have very disposable environments that can be destroyed and easily recreated to iterate through some testing. I’ve used an environment before that referred to these as ‘smash labs’ because you could snapshot and smash things as necessary. I’ve been using VMware Workstation and linked clones for a while to provide this sort of ‘smash lab’ environment. I recently had a need to switch PCs and decided to rebuild this environment and document the process. The next few blog posts will focus on various aspects of this project, starting with the end result, and then posts covering the builds for each individual VM.

  1. lab-mgmt-01: The management console/GUI that also acts as a domain controller, DNS Server, Certificate Authority, and NAT gateway.
  2. lab-esxi-02 and lab-esxi-03: I’ll cover these two VMs in one post because they are very similar. One is a nested ESXi 7.0.3 host and the other is a nested ESXi 8.0.0 host, each containing a corresponding vCenter Server Appliance.
  3. lab-dock-14: A Photon OS VM with docker and nfs-server services enabled.

We can see all four of these VMs in the following screenshot. You can see that I named them with a parent_ prefix, and that is because I don’t typically power on these VMs, but instead create linked clones that are disposable. This allows me to create various instances of these VMs and switch between them as needed.

For example, if I need to test something, like an SSL certificate replacement script for vCenter Server 7.0.3, I would create linked clones of parent_lab-mgmt-01 and parent_lab-esxi-02 by right clicking the VM > Manage > Clone. In the wizard that pops up I would select “An existing snapshot (powered off only)” and then select “Create a linked clone” on the next page. After giving the VM a name, the clone operation completes nearly instantly. Relevant screenshots are included below:

These linked clones can be powered on and used as needed. The VMs have static IP addresses, so the way networking is configured I can only power on one linked clone copy at a time, but due to resource limitations on my laptop this hasn’t been a problem.
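The clone wizard steps described above can likely be scripted with the vmrun utility that ships with Workstation. The sketch below only builds and prints the argument list; Get-LinkedCloneArgs is a hypothetical helper, and the paths, snapshot name and clone name are made-up values, not the ones used in this lab.

```powershell
# Hypothetical helper that assembles arguments for 'vmrun clone'.
function Get-LinkedCloneArgs {
    param(
        [Parameter(Mandatory)] [string] $ParentVmx,
        [Parameter(Mandatory)] [string] $CloneVmx,
        [Parameter(Mandatory)] [string] $SnapshotName,
        [Parameter(Mandatory)] [string] $CloneName
    )
    # -T ws targets Workstation; 'linked' requests a linked clone of the named snapshot.
    @('-T', 'ws', 'clone', $ParentVmx, $CloneVmx, 'linked',
      "-snapshot=$SnapshotName", "-cloneName=$CloneName")
}

# Example values only; adjust the paths and snapshot name to your environment.
$cloneArgs = Get-LinkedCloneArgs -ParentVmx 'D:\VMs\parent_lab-mgmt-01\parent_lab-mgmt-01.vmx' `
    -CloneVmx 'D:\VMs\test_lab-mgmt-01\test_lab-mgmt-01.vmx' `
    -SnapshotName 'base' -CloneName 'test_lab-mgmt-01'

# Print the arguments for review.
$cloneArgs -join ' '

# Once they look right, run the clone (vmrun.exe must be on the PATH):
# & vmrun.exe @cloneArgs
```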

While we are talking about IP addresses, it is probably helpful to understand the topology that we’ve built. The following image should capture how the VMs interact with each other and the outside network.

As you can see, the lab-mgmt-01 virtual machine is serving as a gateway between the lab network and the rest of the network. If we need to test something as if we are in an airgapped network, we can simply disable the NIC1 (Ethernet0) interface on the management server. Without this adapter working, the rest of the lab becomes isolated from the internet and any other services.

The parent VMs could have a few snapshots. As shown in the following example, one of the snapshots is in use and therefore is locked and cannot be deleted.

Once the test is complete, I could power off my two temporary clones and either delete them or keep them for a couple of days (in case I need to refer back to logs or such).

The next few posts will focus on configuration of the individual component VMs that make up this ‘smash lab’ environment.

Posted in Lab Infrastructure, Virtualization

Which virtual machines have cloned vTPM devices?

In vSphere 7.0, when a virtual machine with a vTPM device is cloned, the secrets and identity in the vTPM are cloned as well. In vSphere 8.0 there is an option during a clone to replace the vTPM so that it gets its own secrets and identity (more information available here: Clone an Encrypted Virtual Machine (vmware.com)).

Someone recently asked me if it would be possible to programmatically find VMs that had duplicate keys/secrets. I looked and found a Get-VTpm cmdlet, which returns a VTpm structure that contains an Id and Key property. I suspected that the Key property would contain the key we were interested in, so I set up a quick test to confirm. Here is the output of a few VMs with vTPM devices showing the Id and Key values.

 Get-VM | Get-VTpm | select Parent, Name, Id, Key

Parent              Name        Id                             Key
------              ----        --                             ---
clone_unique        Virtual TPM VirtualMachine-vm-1020/11000 11000
clone_dupeVtpm      Virtual TPM VirtualMachine-vm-1019/11000 11000
New Virtual Machine Virtual TPM VirtualMachine-vm-1013/11000 11000

As we can see, the Key is actually the hardware device key of 11000 which is static, regardless of whether we expect a duplicate vTPM or not.

However, digging into ExtensionData I found some other more interesting properties, specifically EndorsementKeyCertificateSigningRequest and EndorsementKeyCertificate. Comparing the EndorsementKeyCertificate property confirmed that when a vTPM is duplicated this key is the same, but when it has been replaced it is unique. Taking that information into account, this one liner would group vTPMs by duplicate keys:

Get-VM | Get-VTpm | Select-Object Parent, @{N='vTpmEndorsementKeyCertificate';E={[string][System.Text.Encoding]::Unicode.GetBytes($_.ExtensionData.EndorsementKeyCertificate[1])}} | Group-Object vTpmEndorsementKeyCertificate

The output of this command would be a grouping per key. The Group property would contain all the VM names (aka Parent in this context) using the same key. In the example below, there is 1 VM with a unique key and 2 VMs sharing a key.

Count Name                      Group
----- ----                      -----
    1 52 0 56 0 32 0 49 0 51... {@{Parent=clone_unique; vTpmEndorsementKeyCertificate=52 0 56 0 32 0 49 0 51 0 48 0 32 0 51 0 32 0 50 0 49 0 57 0 32 0 52 0 56 0 3...
    2 52 0 56 0 32 0 49 0 51... {@{Parent=clone_dupeVtpm; vTpmEndorsementKeyCertificate=52 0 56 0 32 0 49 0 51 0 48 0 32 0 51 0 32 0 50 0 49 0 57 0 32 0 52 0 56 0...
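To list only the VMs that share a key, the grouped output can be filtered on Count. The sketch below shows that step with sample objects in place of live Get-VTpm output, so it runs without a vCenter connection; the EkCert property name and its values are made-up placeholders for the calculated key string.

```powershell
# Sample data standing in for the Select-Object output of the one-liner above.
# EkCert values are shortened, invented placeholders.
$vtpms = @(
    [pscustomobject]@{ Parent = 'clone_unique';        EkCert = 'AAA111' }
    [pscustomobject]@{ Parent = 'clone_dupeVtpm';      EkCert = 'BBB222' }
    [pscustomobject]@{ Parent = 'New Virtual Machine'; EkCert = 'BBB222' }
)

# Keep only the groups where more than one VM shares the same key.
$duplicates = $vtpms | Group-Object EkCert | Where-Object Count -gt 1

# Report each shared key as a count plus the VM names that use it.
foreach ($group in $duplicates) {
    [pscustomobject]@{
        SharedBy = $group.Count
        VMs      = ($group.Group.Parent | Sort-Object) -join ', '
    }
}
```

With real data, the same Where-Object filter can be appended to the Group-Object one-liner shown earlier.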

Using this information we could remove/replace the vTPM in the duplicate VMs if needed to ensure a unique key. Note, per the documentation here, “As a best practice, ensure that your workloads no longer use a vTPM before you replace the keys. Otherwise, the workloads in the cloned virtual machine might not function correctly.”

Posted in Scripting, Virtualization

VMware Skyline Insights API PowerShell Module

VMware Skyline is a proactive support tool to help customers avoid problems before they occur. More information on the service can be found here: https://www.vmware.com/support/services/skyline.html. One feature of this service is a GraphQL based API known as the Skyline Insights API. Using this API, you can query for active findings (the problems known/covered in the Skyline catalog) and for affected objects (the inventory items impacted by these findings). This was my first attempt at using a GraphQL based interface and it had a few learning curves.

The first learning curve with GraphQL was dealing with iterating through the results set. By default, the Insights API only returns 200 results per query. Once you have the first 200 records, if the query has more results you need to ask for the next batch of 200, and so on until you have all the results. This is easy enough to do, however the API will eventually start rate limiting queries against a Skyline Organization. Due to this, we also need some logic to account for these HTTP 429 rate limiting responses. As I attempted to solve these issues, I ended up creating a PowerShell module that would account for these lessons learned. The remainder of this post will cover how to use this new VMware.Skyline.InsightsApi PowerShell module.
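The paging and back-off logic described above could look roughly like the following. This is a minimal sketch, not the module’s actual implementation: Get-AllPages, its parameters and the fake fetch scriptblock are all hypothetical, and it assumes the fetcher throws an error mentioning 429 when it is rate limited.

```powershell
# Hypothetical helper: requests batches until a short page is returned and
# retries a batch when the fetch scriptblock reports rate limiting (429).
function Get-AllPages {
    param(
        [Parameter(Mandatory)] [scriptblock] $FetchPage,  # invoked as: & $FetchPage <offset> <limit>
        [int] $PageSize = 200,
        [int] $MaxRetries = 5,
        [int] $RetryDelaySeconds = 1
    )
    $results = @()
    $offset = 0
    while ($true) {
        $attempt = 0
        while ($true) {
            try {
                $page = @(& $FetchPage $offset $PageSize)
                break   # batch retrieved, leave the retry loop
            } catch {
                $attempt++
                # Give up on errors other than 429, or after too many retries.
                if ($_.Exception.Message -notmatch '429' -or $attempt -gt $MaxRetries) { throw }
                Start-Sleep -Seconds ($RetryDelaySeconds * $attempt)
            }
        }
        $results += $page
        if ($page.Count -lt $PageSize) { break }   # short page means no more data
        $offset += $PageSize
    }
    return $results
}

# Fake data source: 450 records, with a simulated 429 on the second call.
$data = 1..450
$script:calls = 0
$fetch = {
    param($offset, $limit)
    $script:calls++
    if ($script:calls -eq 2) { throw 'HTTP 429 Too Many Requests' }
    $data | Select-Object -Skip $offset -First $limit
}

$all = Get-AllPages -FetchPage $fetch -RetryDelaySeconds 0
"Retrieved $($all.Count) records in $script:calls calls"
```

Keeping the fetch step as a scriptblock separates the paging loop from the HTTP details, which also makes the retry behavior easy to exercise without touching the real API.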

To get started, you’ll need an API token. The process to create one is well documented here: https://blogs.vmware.com/kb/2021/12/skyline-insights-api-getting-started.html.

Second, you’ll need the module. All of the code for this is available in the VMware PowerCLI-Example-Scripts repo at https://github.com/vmware/PowerCLI-Example-Scripts/tree/master/Modules/VMware.Skyline.InsightsApi and also available in the PowerShell Gallery. The easiest way to install this module is with Install-Module VMware.Skyline.InsightsApi. This module contains 7 functions:

Get-Command -Module VMware.Skyline.InsightsApi

CommandType Name                             Version Source
----------- ----                             ------- ------
Function    Connect-SkylineInsights          1.0.0   VMware.Skyline.InsightsApi
Function    Disconnect-SkylineInsights       1.0.0   VMware.Skyline.InsightsApi
Function    Format-SkylineResult             1.0.0   VMware.Skyline.InsightsApi
Function    Get-SkylineAffectedObject        1.0.0   VMware.Skyline.InsightsApi
Function    Get-SkylineFinding               1.0.0   VMware.Skyline.InsightsApi
Function    Invoke-SkylineInsightsApi        1.0.0   VMware.Skyline.InsightsApi
Function    Start-SkylineInsightsApiExplorer 1.0.0   VMware.Skyline.InsightsApi

The first two functions listed (Connect-SkylineInsights and Disconnect-SkylineInsights) are used to connect to the API. The first requires an -apiKey parameter, which we obtained earlier. With this apiKey a global variable is created ($Global:DefaultSkylineConnection) containing the bearer token used to query the API. The second function simply clears out this global variable. As a safety mechanism, logic exists in the helper function to prevent the other functions from executing if this global variable is not present.

The Format-SkylineResult function is an optional function that helps with converting some of the objects returned by the API into strings. This is useful if you want to export the output into something like a CSV file. By default, if you attempt to pass the output from one of the other functions to a CSV, like Get-SkylineFinding | Export-Csv D:\tmp\mySkylineFindings.csv, many of the columns will end up with System.Object[] and the date values will be stored as long integers. If we also use this function, such as Get-SkylineFinding | Format-SkylineResult | Export-Csv D:\tmp\mySkylineFindings.csv, the objects are converted to strings (separated by the value you pass to -separator) and the dates are converted to PowerShell dates.
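To illustrate the kind of conversion involved, here is a small sketch that flattens one made-up object with calculated properties. It is not the module’s code: the property names and values are invented, and it assumes the long integer dates are epoch milliseconds.

```powershell
# Hypothetical finding object; property names and values are illustrative only.
$finding = [pscustomobject]@{
    findingId = 'Example-Finding-001'
    products  = @('vcenter01.example.org', 'vcenter02.example.org')
    firstSeen = 1672531200000   # assumed to be epoch milliseconds
}

# Join the array into one string and convert the epoch value to a DateTime (UTC).
$flat = $finding | Select-Object findingId,
    @{ N = 'products';  E = { $_.products -join '; ' } },
    @{ N = 'firstSeen'; E = { [DateTimeOffset]::FromUnixTimeMilliseconds($_.firstSeen).UtcDateTime } }

# The flattened object now exports cleanly instead of showing System.Object[].
$flat | ConvertTo-Csv -NoTypeInformation
```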

The next function listed, Get-SkylineAffectedObject, returns the list of inventory objects affected by a finding. It requires a -findingId and a -products input parameter to be passed in, either by property name or pipeline. A ‘product’ in this context is the case sensitive name of an endpoint in the Skyline Inventory, such as a vCenter Server name, Horizon Connection Server, vROps instance or the like. The ‘findingId’ is the case sensitive ID of the finding / issue that Skyline is aware of. Both of these properties, in the expected case, can be uncovered with the next function.

Next up, Get-SkylineFinding is a function that requires no input parameters, but does support three: the same -findingId and -products described above, as well as -severity. You can specify any number of these parameters, either by name or pipeline. The severity parameter is implemented client side, so all records are returned to the function, but the function will only return those matching one of the Skyline finding severities (Critical, Moderate, or Trivial). The output of this function can be piped to the above AffectedObject function, such as Get-SkylineFinding -severity:CRITICAL | Get-SkylineAffectedObject.

The Invoke-SkylineInsightsApi function is a proxy function that is consumed by both Get-SkylineAffectedObject and Get-SkylineFinding and is usually not called directly. It is exposed in the module for testing and any sort of future use. This is where much of the logic is implemented so that it can be shared by the two get functions.

Last, but not least, is the Start-SkylineInsightsApiExplorer function. This function will take the bearer token from the Connect-SkylineInsights function, put it in the clipboard, then launch the Skyline Insights API Explorer website in a web browser. From here, you can paste the bearer token into the ‘Request Headers’ area and interactively explore the GraphQL query for Skyline Findings.
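Putting the functions together, an end-to-end session might look like the sketch below. It only uses the functions and parameters described above, but the SKYLINE_API_TOKEN environment variable and the output file name are my own choices for illustration, and I am assuming Format-SkylineResult accepts the affected object output from the pipeline.

```powershell
# Read the API token from an environment variable (hypothetical name) so it
# does not end up hard-coded in a script.
$token = $env:SKYLINE_API_TOKEN

if (-not $token) {
    Write-Warning 'Set the SKYLINE_API_TOKEN environment variable first.'
} else {
    Connect-SkylineInsights -apiKey $token

    # Critical findings -> affected objects -> flattened strings -> CSV.
    Get-SkylineFinding -severity CRITICAL |
        Get-SkylineAffectedObject |
        Format-SkylineResult -separator '; ' |
        Export-Csv -Path .\criticalAffectedObjects.csv -NoTypeInformation

    Disconnect-SkylineInsights
}
```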

I hope you find this module useful. If you have any feedback please leave a comment below or open an issue in the PowerCLI-Example-Scripts repo here: https://github.com/vmware/PowerCLI-Example-Scripts/issues.

Posted in Scripting, Virtualization