Exploring VM Security: How to Identify Encrypted Virtual Disks in vSphere

I was recently looking at some virtual machines in a lab, trying to determine which had encrypted virtual disks versus only encrypted configuration files. This data is visible in the vSphere UI: from the VM list view we can select the ‘pick columns’ icon in the lower left near the export button (in vCenter Server 8 this is called Manage Columns) and select the checkbox for Encryption.

With this column selected, we can see that all four VMs show as encrypted.

However, if we dig a little deeper, we can see that one VM has both its configuration files and its only hard disk encrypted, as shown below:

Another VM has only the first hard disk encrypted (note that Hard disk 2 does not show the word ‘Encrypted’ below the disk size).

And yet another VM has only encrypted configuration files; its hard disk is not encrypted at all.

This makes sense, as the virtual machine view does not list each virtual disk, only the VM configuration. We can encrypt only the configuration, but we can’t encrypt only a hard disk without also encrypting the configuration. This view shows that there is something going on with encryption, but for what I was looking for we’ll need to dig a bit deeper.

Since I wanted to check each VMDK of each VM, which isn’t easily viewable in the UI without lots of clicking, I switched over to PowerCLI. I found a blog post from a couple of years back (https://blogs.vmware.com/vsphere/2016/12/powercli-for-vm-encryption.html) which mentioned a community PowerShell module (https://github.com/vmware/PowerCLI-Example-Scripts/tree/master/Modules/VMware.VMEncryption) for reporting on encryption. Browsing through the code, I saw a ‘KeyId’ property that is present on VMs whose configuration is encrypted and on hard disks that are encrypted. I created a quick script to loop through all the VMs looking for either of these properties. I could have used the published module, but for this simple exercise it was easy enough to pick and choose the fields I needed.

$myResults = @()
foreach ($thisVM in Get-VM) {
  foreach ($thisVMDK in ($thisVM | Get-HardDisk)) {
    # One row per hard disk: VM name, config ('Home') encryption, VMDK encryption, disk name, vTPM presence
    $myResults += $thisVMDK | Select-Object @{N='VM';E={$thisVM.Name}},
                @{N='ConfigEncrypted';E={ if ($thisVM.ExtensionData.Config.KeyId.KeyId) {'True'} }},
                @{N='VMDK Encrypted';E={ if ($_.ExtensionData.Backing.KeyId.KeyId) {'True'} }},
                @{N='Hard Disk';E={$_.Name}},
                @{N='vTPM';E={ if ($thisVM.ExtensionData.Config.Hardware.Device | Where-Object {$_.Key -eq 11000}) {'True'} }} # device key 11000 is the vTPM
  } # end foreach VMDK
} # end foreach VM

$myResults | Sort-Object VM | Format-Table -AutoSize

Our $myResults variable now contains a row for each virtual hard disk, showing the VM name, whether the ‘Home’ configuration is encrypted, whether the VMDK is encrypted, the hard disk name, and whether the system has a vTPM. The final line sorts the results by VM name and lists all of the properties. However, if I needed a list of all the VMs that might have one or more encrypted VMDKs, I could use the following Where-Object filter.

$myResults | Where-Object {$_.'VMDK Encrypted' -eq 'True'} | Select-Object VM -Unique

This results in a list of VM names: only two interesting VMs, even though the screenshot from the UI above showed four VMs with encrypted configs.
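If, like me, you specifically want the VMs whose configuration is encrypted but whose disks are not, a quick follow-up comparison against the same $myResults variable can surface them. A minimal sketch, reusing the property names defined above:

# Unique VM names for each category
$encryptedConfig = $myResults | Where-Object {$_.ConfigEncrypted -eq 'True'} | Select-Object -ExpandProperty VM -Unique
$encryptedDisk   = $myResults | Where-Object {$_.'VMDK Encrypted' -eq 'True'} | Select-Object -ExpandProperty VM -Unique
# VMs with an encrypted 'Home' but no encrypted VMDKs
$encryptedConfig | Where-Object { $encryptedDisk -notcontains $_ }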

Hopefully this will be helpful if you are looking for encrypted VMs in an environment.


Unlocking the Power of VMware.vSphere.SsoAdmin: Automated Reporting and Management

I’ve recently had a couple of questions around automated reporting on, or changes to, the vCenter Server SSO domain. I’ve seen mention of the VMware.vSphere.SsoAdmin PowerCLI module, but hadn’t had a need to dig into it until now. This post will explore a couple of things that can be achieved with this module.

Installing the Module & Connecting to an SSO Server

The module is available in the PowerShell Gallery as well as in the PowerCLI-Example-Scripts Repo (https://github.com/vmware/PowerCLI-Example-Scripts/tree/master/Modules/VMware.vSphere.SsoAdmin). You can install it with the following syntax:

Install-Module VMware.vSphere.SsoAdmin -Scope:CurrentUser

Once the module is installed we can connect to an SSO server (this is my vCenter Server Appliance).

Connect-SsoAdminServer -Server lab-vcsa-12.example.org -User brian -Password VMware1! -SkipCertificateCheck

A successful connection should return some details about the name/Uri/user that is connected. The following few examples all depend on a successful connection.

Reporting on Group Membership

The first reporting task I was asked about was seeing which users were members of the vsphere.local Administrators group. We can do this by finding the group, then piping that to another cmdlet provided by this module.

Get-SsoGroup -name Administrators -Domain vsphere.local | Get-SsoPersonUser

Here is a sample output:

Name          Domain        Locked Disabled PasswordExpirationRemainingDays
----          ------        ------ -------- -------------------------------
Administrator vsphere.local  False    False                              -1
test1         localos        False    False                              -1
brian         example.org    False    False                              35
lop           localos        False    False                              -1
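For scheduled reporting, the same output can be captured to a file. Here is a minimal sketch using the properties shown above (the file name is arbitrary):

Get-SsoGroup -Name Administrators -Domain vsphere.local | Get-SsoPersonUser |
  Select-Object Name, Domain, Locked, Disabled, PasswordExpirationRemainingDays |
  Export-Csv -Path .\vsphere-local-admins.csv -NoTypeInformation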

Changing the administrator@vsphere.local password

One request I received was around the ability to programmatically change the password for the administrator@vsphere.local account. We can do this with a single line of code:

Get-SsoPersonUser -Name administrator -Domain vsphere.local | Set-SsoPersonUser -NewPassword VMware1!VMware1!

In the above example, we are finding a specific user (with Get-SsoPersonUser) then we pipe that output to Set-SsoPersonUser and specify our NewPassword value.

Once the password is changed, we can log in to the UI or with Connect-VIServer to validate that our credentials were successfully updated.
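For example, a quick validation sketch (reusing the server name from earlier in this post; adjust to your environment):

Connect-VIServer -Server lab-vcsa-12.example.org -User administrator@vsphere.local -Password 'VMware1!VMware1!'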

Updating the Active Directory over LDAP Identity Source password

From time to time it may be necessary to update the username/password used to bind to an Active Directory domain in the vCenter identity sources list. If we have a small number of vCenter Servers, we could probably do this in the GUI as shown in the screenshot below:

However, for a large number of vCenter Servers, or frequent password rotation, automation may be helpful. Fortunately this module can help update this identity source as well.

Get-IdentitySource -External | Where-Object {$_.Name -eq 'example.org'} |
Set-LDAPIdentitySource -Username 'EXAMPLE\svc-ldapbind-a' -Password 'VMware1!'

In the above example we get the external identity sources only, use Where-Object to filter to a specific identity source (this environment has multiple LDAPS directories which require different bind users), then set that identity source, updating both the username and password values. This is actually better than the GUI! When we make the same change in the GUI we also need to provide the certificate; with this module we can update only the necessary values and leave the existing certificate in place. (Note: this module is also capable of updating the certificate if needed.)
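For the multi-vCenter / password-rotation scenario mentioned above, the same two cmdlets can be wrapped in a loop. A sketch, assuming the module’s Disconnect-SsoAdminServer cmdlet and a hypothetical server list and new password:

$ssoServers = 'lab-vcsa-12.example.org', 'lab-vcsa-13.example.org'   # hypothetical list
foreach ($server in $ssoServers) {
  $conn = Connect-SsoAdminServer -Server $server -User brian -Password 'VMware1!' -SkipCertificateCheck
  Get-IdentitySource -External | Where-Object {$_.Name -eq 'example.org'} |
    Set-LDAPIdentitySource -Username 'EXAMPLE\svc-ldapbind-a' -Password 'NewP@ssw0rd!'   # new bind password (hypothetical)
  Disconnect-SsoAdminServer -Server $conn
}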

Conclusion

The VMware.vSphere.SsoAdmin module is very powerful and worth a closer look.


Step-by-Step: Installing Ubuntu 24.04 on a Raspberry Pi for DNS and NTP

In my home network, I have a Raspberry Pi 4 which provides DNS (pi-hole) and NTP (chrony). It’s a device that I don’t touch often, and it runs a ‘production’ type service: in my lab I don’t mind blowing up / breaking things, but this device needs to be stable. If DNS goes offline the family can’t stream shows, and it’s a real production-down sort of situation. Systems in my lab consume NTP from this device, and regular devices in my home network rely on it for DNS (for ad blocking as well as conditional forwarding of lab domains to DNS servers in the lab). A few days ago, I noticed that this system was down: it wasn’t answering DNS requests, and SSH/VNC weren’t working. After power cycling the system, I was also no longer able to ping the device. After a bit of troubleshooting, I realized that the SD card used as boot media had failed. The system had been running 24×7 for ~5 years, logging DNS requests and such, probably more write IO than anyone should expect from a consumer SD card.

To resolve the issue I ordered a new SD card… but I realized that this system had accumulated about 5 years of various configuration changes. I’m going to attempt to document the configuration (at least what I remember of it) below.

OS Installation

The previous Raspberry Pi used the Raspbian OS with a GUI. However, I never really used the GUI and primarily access this system remotely. Since most other systems I manage use Ubuntu (specifically 24.04), I decided to install that OS using the server instructions from here: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview.

I used the Raspberry Pi Imager for Windows, which allowed me to customize the username/password, hostname, etc. of the OS so that it booted up and I could connect via SSH.

Once I was logged into the system, the first thing I did was make sure it was up to date using sudo apt update && sudo apt upgrade. This installed a bunch of updates, so I rebooted for good measure.

Lab Certificate

In rare cases, I’ll access something in my lab from the Raspberry Pi. To make this work without certificate warnings, I installed the lab CA certificate. This is just two commands, one to download the file and another to update the trust store.

sudo wget http://www.example.com/build/rootca-example-com.crt -P /usr/local/share/ca-certificates
sudo update-ca-certificates
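To confirm the CA is now trusted, any HTTPS endpoint signed by the lab CA should work without warnings (the URL below is hypothetical):

curl -I https://someserver.example.com/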

Install extra packages

I had a handful of extra packages that I installed. I’ll discuss each of these later, but for now we’ll install them all in one pass.

sudo apt install sssd-ad sssd-tools realmd adcli chrony tinyproxy

Proxy Server

For some occasional testing, I’ll use a proxy server in my lab. This was running in a dedicated VM, but while I’m revisiting things, I decided to co-locate it on this appliance.

# configure proxy
sudo nano /etc/tinyproxy/tinyproxy.conf

# change LogLevel from Info to Warning
# Allow 192.168.0.0/16 by removing comment

sudo systemctl reload tinyproxy
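To confirm the proxy is answering, a quick test from another machine helps; tinyproxy listens on port 8888 by default (the IP below is the static address this device receives later in this post):

curl -x http://192.168.127.53:8888 -I https://www.example.com/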

NTP (chrony)

I prefer having NTP servers running on physical devices. Since I don’t have many of those in the lab, I use the Raspberry Pi as a locally accessible NTP server. I’m using the chrony service to do this and allow anything in the lab to query this device for time.

# configure NTP
sudo nano /etc/chrony/chrony.conf
# append the following comment / allow lines to the file
# Define the subnets that can use this host as an NTP server
allow 192.168.0.0/16

sudo systemctl restart chrony.service
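A couple of optional sanity checks, using the chronyc utility that ships with chrony:

chronyc tracking      # confirm this host is itself synchronized
sudo chronyc clients  # list hosts that have queried this NTP server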

Pi-Hole

The reason I first purchased this Raspberry Pi was to block ads on my home network using pi-hole.

curl -sSL https://install.pi-hole.net | bash

# create a custom config file for various forward/reverse domain forwarding:
sudo nano /etc/dnsmasq.d/05-custom.conf

# contents of above new file
server=/lab.enterpriseadmins.org/192.168.127.30
server=/lab.enterpriseadmins.org/192.168.32.30
server=/example.com/192.168.127.30
server=/example.com/192.168.32.30
server=/168.192.in-addr.arpa/192.168.127.30
server=/168.192.in-addr.arpa/192.168.32.30

# from web UI, restart resolver.
# Update pihole settings > DNS, change from recommended allow only local requests to 'permit all origins' so that all lab subnets can resolve names.

# enable php for non-pihole /admin locations
sudo lighttpd-enable-mod fastcgi fastcgi-php
sudo service lighttpd reload

# Create redirect page for / to /admin
echo '<head>  <meta http-equiv="Refresh" content="0; URL=/admin" /> </head>' | sudo tee /var/www/html/index.html

# Create 'get-hostname.php' in /var/www/html as well; this is for the Aria Operations management pack. The contents of the file should be:
<?php echo '{"hostname":"' . gethostname() . '"}'; ?>
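If you prefer a single copy/paste command in the same style as the redirect page above, the identical PHP one-liner can be written with tee:

echo "<?php echo '{\"hostname\":\"' . gethostname() . '\"}'; ?>" | sudo tee /var/www/html/get-hostname.php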

Active Directory Join

Most Ubuntu boxes in my lab are joined to Active Directory for common logins. I configured the same for the Raspberry Pi, although it is not really required.

# configure AD
echo '%lab\ linux\ sudoers ALL=(ALL) NOPASSWD:ALL' | sudo tee -a /etc/sudoers
sudo /usr/sbin/pam-auth-update --enable mkhomedir

sudo /usr/sbin/realm join lab.enterpriseadmins.org -U svc-windowsjoin --computer-ou "ou=services,ou=lab servers,dc=lab,dc=enterpriseadmins,dc=org"
sudo sed -i -e 's/^#\?use_fully_qualified_names.*/use_fully_qualified_names = False/g' /etc/sssd/sssd.conf
sudo systemctl restart sssd.service
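A couple of optional checks to confirm the join worked (the account shown is the join account from above; any AD user would do):

realm list                                    # the domain should show as configured
id svc-windowsjoin@lab.enterpriseadmins.org   # resolve an AD account through sssd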

Static IP

Once everything was configured and ready, I put the device in service by changing the IP from the DHCP address originally obtained to the static IP address I have configured on most devices. This is done by editing the netplan configuration file (under /etc/netplan/) to look something like the following:

network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: "dc:a6:32:aa:aa:aa"
      dhcp4: no
      addresses: [192.168.127.53/24]
      routes:
        - to: default
          via: 192.168.127.254
      nameservers:
        addresses: [192.168.127.53,192.168.32.53]

To make the new network settings active, we must apply those file changes with sudo netplan apply.
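A quick check that the new address took effect:

ip -4 addr show eth0   # should now list 192.168.127.53/24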

Cleanup

The Raspberry Pi Imager utility used cloud-init to do some customizations. This was running at each startup and left a few messages on the system console. Since we no longer need cloud-init after the system is online, we’ll just remove the package with:

sudo apt purge cloud-init

Conclusion

The Raspberry Pi in my lab has been running for about 5 years with little to no maintenance. Other than this one failed SD card, things have been very reliable. The steps here are mostly notes for future reference if I need to rebuild the device again. Hopefully you’ll find the notes helpful.


Automating Cluster Management with Aria Operations API

As part of routine maintenance, it is sometimes necessary to take an Aria Operations cluster offline. For example, it is recommended to take the cluster offline to perform backups (https://docs.vmware.com/en/VMware-Aria-Operations/8.12/Best-Practices-Operations/GUID-1D058B4A-93BA-44D1-8794-AE8E1B96B3E4.html).

Since most folks want to schedule backups, it is important to be able to leverage automation to take the cluster offline. There is a cluster management API guide at https://ops.example.com/casa/api-guide.html that has some details on how to do this.

Authentication

When logging into this API, I provided the admin username/password combination. Here is an example of checking the cluster state using that method:

$creds = Get-Credential
(Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Credential $creds).cluster_online_state_snapshot

However, I’d prefer to use a centrally managed service account in Active Directory for such tasks. The ability to do this was first introduced in vRealize Operations 8.6 (doc) and still exists in Aria Operations 8.18 (doc). It depends on a separate Active Directory configuration/definition from the one in the product UI. The links provided show where/how to configure this identity provider from the /admin interface. Here is a screenshot showing this configuration:

Once Active Directory is configured for admin operations, we need to change our API authentication slightly to be able to use it. In the original example, we provided our username & password as a PowerShell credential object. In this example, we’ll make an extra API call to authenticate, then use the resulting bearer token as a header when checking the status. A code sample is below; you’ll notice the Authorization header, which passes vrops-ldap along with the base64-encoded username (as an AD userPrincipalName), colon, and password to the authorize resource. That resource returns a token that we then provide as a header to check the cluster status.

$b64 = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes("h267-opsbu@lab.enterpriseadmins.org:VMware1!"))

$authorize = Invoke-RestMethod -Uri 'https://ops.example.com/casa/authorize' -Method Post -ContentType 'application/json' -Headers @{Authorization="vrops-ldap $b64"; Accept='application/json'}

(Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'} -ContentType 'application/json').cluster_online_state_snapshot

Taking the cluster offline

With the authentication sorted out above, we can now post to this API to take the cluster offline. You’ll notice that we set the state to offline and provide a reason why. The request uses the same bearer token that we created in the example above.

$body = @{ 'online_state'='OFFLINE'; 'online_state_reason'='Lets back this thing up.'} | convertto-json
Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Body $body -Method POST -ContentType 'application/json'  -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}

The above example submits a request to take the cluster offline but returns immediately after doing so. In the URI we could append ?async=false so that our command waits until completion. Another option is to submit an async request (the default), then create a loop that periodically checks the cluster state using the prior ‘get’ request until the cluster is offline, as sketched below. I prefer the periodic polling option, as you can code in your own counter/timing/failure logic as needed.
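Here is a minimal polling sketch, reusing the $authorize token from above and assuming the snapshot value reads OFFLINE once the cluster is down; the retry count and sleep interval are arbitrary:

$headers = @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}
$attempts = 0
do {
  Start-Sleep -Seconds 30
  $state = (Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Headers $headers -ContentType 'application/json').cluster_online_state_snapshot
  $attempts++
} while ($state -ne 'OFFLINE' -and $attempts -lt 20) # give up after ~10 minutes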

If you check out the docs at /casa/api-guide.html, you’ll also see examples of setting the “Show reason on maintenance page” checkbox via the JSON body.

Bring the cluster back online

After our maintenance / backup task is complete, we’ll want to bring the cluster back online. In this example we don’t need to provide a reason in our body.

$body = @{ 'online_state'='ONLINE'} | convertto-json
Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state?async=false -Body $body -Method POST -ContentType 'application/json' -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}

In this example I’m using the ?async=false so that the API call doesn’t return until the cluster is back online. Again, we could opt to use the default async request and periodically poll the service if we’d like.

Conclusion

The casa API is very useful for automating cluster management tasks. This article focuses on a few examples related to cluster state changes and authentication, but the API supports many other things, like PAK file uploads, NTP & certificate management, and even the configuration of AD authentication. You should check out /casa/api-guide.html on an Aria Operations node for more examples.


Scaling Your Tests: How to Set Up a vCenter Server Simulator

I’ve recently been testing a handful of reporting tools against vCenter Server endpoints. I have several lab instances for various major releases, which allow me to test a wide range of configurations. However, there are some tests that are hard to simulate, like what if a cluster has 100 hosts, or one host has 200 VMs?

Years ago I remember stumbling on a vCenter Server simulator. I didn’t have a specific need for it at the time, but with this recent testing I checked, and the tool still exists and is actively maintained here: https://github.com/vmware/govmomi/blob/main/vcsim/README.md. There is even a container image available. This article will show how to create a single VM that can help simulate many vCenter Servers for our various reporting requirements.

Server setup

I started with an Ubuntu 24.04 virtual machine (deployed from a template previously created with Packer, as described in this previous post). I then installed docker compose and made a few configuration changes to make docker a bit easier to work with.

# install docker compose, add our active directory user & root to the docker group
sudo apt install docker-compose-v2
sudo usermod -aG docker ${USER}
sudo usermod -aG docker root
# Log off/on for new group membership to become effective

# create a folder for some of our configs, make the docker group owner of that folder
sudo mkdir /data
sudo chgrp docker /data
sudo chmod g+srw -R /data 

# configure docker to not overlap network ranges and make ranges smaller than default
echo '{"default-address-pools":[{"base":"172.17.0.0/16","size":26}]}' | sudo tee --append /etc/docker/daemon.json
sudo systemctl restart docker

# create a folder for our simulator
mkdir /data/vcsim
cd /data/vcsim

Sample vCenter Inventories

William Lam maintains a github repo with recordings of some lab inventories, available here: https://github.com/lamw/govc-recordings. I downloaded this repo and extracted the contents to /data/vcsim/sims, which provides 5 different lab environments from about 4 years ago.

This repo also contains instructions on how to save an existing inventory for use in the simulator. Getting ready to do some maintenance or decommission an environment? Might as well save a point-in-time snapshot of the inventory that we can report against later.
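As a sketch of what that looks like, the repo uses govc’s object.save command; the connection details and output path here are hypothetical:

export GOVC_URL='https://lab-vcsa-12.example.org/sdk' GOVC_USERNAME='administrator@vsphere.local' GOVC_PASSWORD='VMware1!'
govc object.save -d /data/vcsim/sims/my-saved-vcenter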

Docker compose.yml

Looking at this file, you’ll see 3 instances of the vcsim container running. There are two generic inventories, controlled by parameters passed to the container command for things like the number of hosts per cluster, port groups, and listening port. The third vCenter inventory is one of the saved inventories from the govc-recordings repo. I’m running multiple saved inventories, but in the interest of saving space have only included one sample below. Each vcsim container gets a container IP and listens on port 8989. In this example, nginx handles all of the incoming traffic; we reference a reverse_proxy.conf file as well as a certificate folder, both of which are discussed later.

In the /data/vcsim/ folder, compose.yml has the following contents:

version: '2'
services:
  # This nginx web server will terminate at a wildcard SSL certificate, then use the first label
  # of the DNS entry to send requests to port 8989 on the associated container, for example:
  # https://small.vcsim.example.com:443 --> https://small:8989
  nginx:
    image: nginx:latest
    volumes:
      - /data/vcsim/reverse_proxy.conf:/etc/nginx/conf.d/default.conf
      - /data/vcsim/cert:/etc/nginx/certs
    ports:
      - "443:443"

  # Here are a few simulated VCs with progressively larger inventories
  small:
    image: vmware/vcsim:latest
    command: -api-version "8.0.3" -cluster 1 -dc 1 -folder 5 -host 8 -standalone-host 0 -l "0.0.0.0:8989" -vm 20
    restart: unless-stopped
  medium:
    image: vmware/vcsim:latest
    command: -api-version "8.0.3" -cluster 2 -dc 2 -folder 5 -host 16 -standalone-host 0 -l "0.0.0.0:8989" -vm 100
    restart: unless-stopped

  # The following examples are from a github repo
  wlam7:
    image: vmware/vcsim
    command: -load /simdata -l "0.0.0.0:8989"
    restart: unless-stopped
    volumes:
      - /data/vcsim/sims/vcsim-vcsa.primp-industries.local:/simdata

nginx reverse_proxy.conf

This nginx configuration extracts the subdomain from the request, for example the word small from small.vcsim.example.com, and stores it in a $sub variable. We then proxy the request for each subdomain to the appropriate container, for example https://small.vcsim.example.com:443/ to the individual container at https://small:8989/. In DNS we only need to create a single wildcard CNAME record for *.vcsim.example.com that points to our container host. Any container that gets created is automatically available via our wildcard subdomain.

I’ve not included the certificate files, but as you can see from the compose.yml and reverse_proxy.conf files, there is a directory at /data/vcsim/cert containing a PEM formatted certificate and private key named wildcard-vcsim-example-com.pem and wildcard-vcsim-example-com.key. This allows all our requests (from PowerCLI or the like) to have a valid certificate, while we only need to maintain a single certificate file.

server {
   # Listen for any HTTPS request
   listen 443 ssl;

   # Extract the subdomain name from 'SUB.vcsim.example.com'
   server_name ~^(?<sub>[^.]+)\.vcsim\.example\.com$;

   # Define the path to the certificate and key
   ssl_certificate /etc/nginx/certs/wildcard-vcsim-example-com.pem;
   ssl_certificate_key /etc/nginx/certs/wildcard-vcsim-example-com.key;

   # For any request, proxy to the container name on port 8989
   location / {
        resolver 127.0.0.11 valid=1s;
        proxy_pass https://$sub:8989;
   }
}

Bring Up

With our compose.yml, reverse_proxy.conf, certificate files, and simulated inventories in place, we are ready to start up the service. To do this we only need to run a single command:

docker compose up -d

When the system restarts, these containers will restart automatically (thanks to the restart: unless-stopped policy). If we want to check stats or logs, we can do so with the following:

docker stats
docker compose logs

When running the docker compose commands we need to be in the /data/vcsim/ folder, or specify that folder via arguments.
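As a usage sketch, PowerCLI can now be pointed at any of the simulators through the wildcard DNS entry; vcsim typically accepts arbitrary credentials:

Connect-VIServer -Server small.vcsim.example.com -User 'user' -Password 'pass'
Get-VM | Measure-Object   # count the simulated VMs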

Conclusion

Using the above steps, I have a single lab VM with ~10 vCenter inventories… using less than 2GB of RAM and 2 vCPUs. There are of course some caveats, like no UI and some methods that don’t exist in the simulator, but for many test scenarios where you just need the SOAP API, this works great.
