vSphere Custom Images & How to Compare Image Profiles

Occasionally there is a need to create a custom ESXi image as either an installable ISO or a depot/zip bundle. For example, when setting up a new host, you may wish to automatically include specific drivers for a particular network card or storage adapter. There are a variety of ways to do this.

PowerCLI Image Builder Cmdlets

PowerCLI has been able to create custom images for many years. In this example, I plan to combine the ESXi 8.0 Update 2 image from VMware with the HPE Server addon (from https://www.hpe.com/us/en/servers/hpe-esxi.html). This specific image combination is already available directly from HPE, but the steps to manually combine the bundles should be the same if the combination is not available, for example if we wanted to include 8.0u2x (where x is a lettered patch release).

The first step is to get our two files, the stock VMware image (VMware-ESXi-8.0U2-22380479-depot.zip) and the HPE addon (HPE-802.0.0.11.5.0.6-Oct2023-Addon-depot.zip). We will add both of these depots to a PowerCLI session using the following:

Add-EsxSoftwareDepot -DepotUrl '.\VMware-ESXi-8.0U2-22380479-depot.zip','.\HPE-802.0.0.11.5.0.6-Oct2023-Addon-depot.zip'

When these depots are added, the Depot Url for each will appear onscreen in the format zip:<localpath>depot.zip?index.xml. We’ll want to note the path listed for the HPE addon, as we will use it again shortly. With these depots added, we can now query for image profiles. Only the ESXi image will have profiles, but there are likely multiple versions and we want to see what is available.

Get-EsxImageProfile

Name                           Vendor          Last Modified   Acceptance Level
----                           ------          -------------   ----------------
ESXi-8.0U2-22380479-no-tools   VMware, Inc.    9/4/2023 10:... PartnerSupported
ESXi-8.0U2-22380479-standard   VMware, Inc.    9/21/2023 12... PartnerSupported

As mentioned, multiple profiles are available: one includes VMware Tools (standard) and the other does not (no-tools). We will make a copy of the standard profile:

$newProfile = New-EsxImageProfile -CloneProfile 'ESXi-8.0U2-22380479-standard' -Name 'ESXi-8.0U2-22380479_HPE-Oct2023' -Vendor 'HPE'

We will now add all of the HPE addons to the copy of our image profile. This is where we’ll need that local depot path mentioned above.

Add-EsxSoftwarePackage -ImageProfile $newProfile -SoftwarePackage (Get-EsxSoftwarePackage -SoftwareDepot zip:D:\tmp\custom-image\HPE-802.0.0.11.5.0.6-Oct2023-Addon-depot.zip?index.xml)

In this example we added all of the packages from the depot, but we could have included only a subset of specific VIBs by name if desired. We could have also included other VIBs from different depots (for example, from a compute vendor AND other VIBs from a storage vendor).
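As a quick sketch of the subset approach (the VIB names below are hypothetical placeholders, not actual HPE package names):

# Pull only specific VIBs from the HPE depot by name, then add just those to the profile
$hpeDepot = 'zip:D:\tmp\custom-image\HPE-802.0.0.11.5.0.6-Oct2023-Addon-depot.zip?index.xml'
$subset = Get-EsxSoftwarePackage -SoftwareDepot $hpeDepot | Where-Object { $_.Name -in 'example-driver','example-agent' }
Add-EsxSoftwarePackage -ImageProfile $newProfile -SoftwarePackage $subset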

With our custom image created, combining the VMware and HPE bits, we can now export as ISO or Bundle (ZIP). In this example I’ll export both. The Bundle (ZIP) will be used for some comparisons later.

Export-EsxImageProfile -ImageProfile $newProfile -ExportToIso -FilePath 'PowerCLI_ESXi-8.0U2-22380479_HPE-Oct2023.iso'
Export-EsxImageProfile -ImageProfile $newProfile -ExportToBundle -FilePath 'PowerCLI_ESXi-8.0U2-22380479_HPE-Oct2023.zip'

vCenter Image Managed Clusters

Starting in vSphere 7, hosts can be managed with a single cluster image, and that image can be customized directly in the web interface. The screenshot below is from the workflow that comes up when creating a new cluster; we just need to pick the values from the provided drop-down lists.

Similar to the above PowerCLI example, we are going to create an image that combines the ESXi 8.0 U2 build with a specific HPE Vendor Add-on (802.0.0.11.5.0-6). Once the cluster creation is complete, the image can be exported from the UI. Select the ellipsis > Export, then choose JSON (for a file showing the selections made), ISO (for an image that can be used for installation), or ZIP (for updating an existing installation). I’m going to download a ZIP to be used in the next step. This results in a file named OFFLINE_BUNDLE_52d9502b-7076-7cb2-49b9-cbee13c57f0a.zip.

Comparing Images

The above two processes attempted to create similar images with identical components (the same ESXi image & HPE addon). We may have a need to compare images like these… either by comparing the depot files to each other or by comparing a depot file to a running ESXi host. This section will focus on those comparisons.

Since we have two ZIP archive files, the first inclination might be to simply compare the file size or MD5 checksum. However, if we look at the file size (Length property below), we’ll notice that the files differ slightly in size. This difference can be explained by a number of things, such as the different strings used for various names.

Get-ChildItem PowerCLI*.zip,offline*.zip | Select-Object Name, Length

Name                                                       Length
----                                                       ------
PowerCLI_ESXi-8.0U2-22380479_HPE-Oct2023.zip            686582727
OFFLINE_BUNDLE_52d9502b-7076-7cb2-49b9-cbee13c57f0a.zip 686552303
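Because the sizes differ, the MD5 checksums will differ as well, which is easy to confirm with Get-FileHash:

# Quick checksum comparison of the two bundles
Get-ChildItem PowerCLI*.zip,offline*.zip | Get-FileHash -Algorithm MD5 | Select-Object Hash, Path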

What we really need to do is compare the VIB contents of these bundles to see if any files are missing or versions are inconsistent. This can be easily completed in PowerCLI. The first step is to import these depots into our session, which we can do as follows:

Add-EsxSoftwareDepot PowerCLI_ESXi-8.0U2-22380479_HPE-Oct2023.zip,OFFLINE_BUNDLE_52d9502b-7076-7cb2-49b9-cbee13c57f0a.zip

With both bundles imported, we can check and see what image profiles are available. We should see two: one from Lifecycle Manager and the other using the name specified in our PowerCLI example. In this step we’ll also create a variable for each profile to be used later.

Get-EsxImageProfile

Name                           Vendor          Last Modified   Acceptance Level
----                           ------          -------------   ----------------
VMware Lifecycle Manager Ge... VMware, Inc.    11/20/2023 4... PartnerSupported
ESXi-8.0U2-22380479_HPE-Oct... HPE             11/20/2023 5... PartnerSupported


$ipLCM = Get-EsxImageProfile -Name 'VMware Lifecycle Manager*'
$ipPCLI = Get-EsxImageProfile -Name 'ESXi-8.0U2-2*'

If we dig into the image profiles, we’ll find that each has a VibList property containing the included VIBs. Digging deeper, we’ll see that each VIB has a Guid that combines the VIB name and version (ex: $ipLCM.VibList.Guid will return the list for one profile; a sample row would look like VMware_bootbank_esx-base_8.0.2-0.0.22380479). Now that we have a field with details on the various VIBs, we can have PowerShell compare them. The first command below will likely return nothing; the second should return all VIBs from our bundle:

Compare-Object $ipLCM.VibList.Guid $ipPCLI.VibList.Guid

Compare-Object $ipLCM.VibList.Guid $ipPCLI.VibList.Guid -IncludeEqual

With the above, we can confirm that our two bundles (ZIP files) have the same contents.

Another question that I’ve heard is whether we can confirm that a running ESXi host matches this bundle, or if any changes are required. One option is esxcli software profile update --dry-run (documented here: https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-esxi-upgrade/GUID-8F2DE2DB-5C14-4DCE-A1EB-1B08ACBC0781.html). However, that typically requires the new bundle to be copied to the host. Since we already have this bundle locally, and imported into a PowerCLI session, we can ask the ESXi host for a list of VIBs and do the comparison locally.

$esxcliVibs = (Get-EsxCli -VMHost 'test-vesx-71' -V2).software.vib.list.invoke()
Compare-Object $ipLCM.VibList.Guid $esxcliVibs.ID

The above example returns a list of VIBs from an ESXi host, then compares each ID value to the Guid from the imported image. If any discrepancies are identified, they’ll be listed. As with the earlier comparison of the two image files, we can add the -IncludeEqual switch to confirm that the command is actually returning results (it will list all of the VIBs instead of nothing).
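If preferred, the SideIndicator from Compare-Object can be translated into something more readable; a small sketch of that idea:

# Label each difference; '<=' means the VIB exists only in the image, '=>' only on the host
Compare-Object $ipLCM.VibList.Guid $esxcliVibs.ID | Select-Object InputObject, @{N='Where';E={ if ($_.SideIndicator -eq '<=') {'Only in image'} else {'Only on host'} }}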


Keeping Linux up to date with Aria Automation Config

I have several persistent Windows & Linux VMs in my lab. The Windows VMs get their OS updates via Windows Server Update Service (WSUS) managed by group policy. This works pretty consistently and keeps everything current. The Linux VMs are a mix of Photon OS 4 and 5 as well as Ubuntu 20.04, and every time I ssh in I see a notification that updates are available. If I have a few minutes I’ll usually take the opportunity to get current… but these VMs could run for weeks without an interactive ssh login, leaving some security risk on the table.

In my lab I have an Aria Automation Config (formerly known as Salt Stack Config) appliance that I’ll use occasionally. It has functionality to run scheduled jobs on managed minions, but I’ve not taken time to set that up — until today.

Inventory

The first step I wanted to tackle was an inventory of the patches required for the various endpoints. Looking around at the available out-of-box functions in Aria Automation Config, I found a command, pkg.list_upgrades, that looked promising. I ran it against all minions and then looked at the resulting output. The raw return is available as JSON, so I imported it into PowerShell (Get-Content timestamp-return.json | ConvertFrom-Json) and started looking at the details. There is an item for each minion, and that item contains a return property which looks similar to the sample below.
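For reference, here is the minimal import into the $json variable used for the rest of this section:

# Import the raw job return downloaded from Aria Automation Config
$json = Get-Content .\timestamp-return.json | ConvertFrom-Json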

$json[0].return |fl


libpq5          : 11.22-0+deb10u1
python-urllib3  : 1.24.1-1+deb10u2
python3-urllib3 : 1.24.1-1+deb10u2

For my purposes I’m mostly interested in the count of the return items… the number of packages that need to be updated. To find these counts, I came up with the following one-liner:

$json | Select-Object minion_id, has_errors, @{N='NumUpdates';E={($_.return | Get-Member -Type NoteProperty | Measure-Object).Count}} | ?{$_.NumUpdates -ne 0} |Sort-Object minion_id

minion_id                              has_errors NumUpdates
---------                              ---------- ----------
dr-extdns-21.lab.enterpriseadmins.org       False          8
h135-linux-01.lab.enterpriseadmins.org      False          7
h135-linux-02.lab.enterpriseadmins.org      False        105
net-cont-21.lab.enterpriseadmins.org        False         75
net-wanrtr-01.lab.enterpriseadmins.org      False         46
raspberrypi                                 False          3
saltmaster                                  False         34
svcs-cont-21.lab.enterpriseadmins.org       False         61

Looking at this list I can see that my updates are all over the place, some are fairly current, others are way off.

Apply the Updates

To apply updates to Linux, I first created a small state file called updates/os.sls and entered the following contents:

update_pkg:
  pkg.uptodate:
    - refresh: True

I applied this state to an Ubuntu box, and once complete, logged in to check the system and make sure it was fully patched. Running apt list --upgradable, I found some missing updates. I re-applied the state file, hoping that another round of updates would get me current, but the same updates were still missing. After doing some searching online, I found that I needed to include an extra line in my state file for dist_upgrade, so version 2 looked like this:

update_pkg:
  pkg.uptodate:
    - refresh: True
    - dist_upgrade: True

Applying this revised state to the Ubuntu VM got it all matched up with the apt list --upgradable results. I then expanded this task to a few more VMs, but ran into a problem when applying the revised version 2 of the state to a Photon OS VM. The results from the Photon VMs showed an error message of No such option: --dist_upgrade, which refers to the additional line I had just added to get better patching coverage.

I could have created two different state files, jobs, and associated schedules — one to target Ubuntu and another for Photon, using both of the above versions of the state file, but I wanted to have a more generic / shared update process for all Linux systems. Instead of creating duplicate states/jobs, I decided to add some if logic to my state file. The out of the box sse/apache/init.sls showed a perfect example of how to accomplish this. For the final version of my state file, I look to see if the OS is Photon, and if so apply the pkg.uptodate without the dist_upgrade flag, otherwise I do include it. The syntax looks like:

update_pkg:
{% if grains['os'] == 'VMware Photon OS' %}
  pkg.uptodate:
    - refresh: True
{% else %}
  pkg.uptodate:
    - refresh: True
    - dist_upgrade: True
{% endif %}

Now when I apply this state to a Linux VM, it updates both system types without error. The sample apache state also has some else if examples if we need to get more specific in the future… for example, pulling in Windows patching with the same state. For now, the single if/else works just fine.

Report on the Updates

The output from the above job, which applies the updates/os.sls state, returns a JSON body similar to the one from our first reporting task. I tried importing this JSON into PowerShell to see what sort of interesting reporting was available. One set of data we can see is the list of packages which were updated, including the old & new version numbers. Here is an example from one minion:

$json[0].return.'pkg_|-update_pkg_|-update_pkg_|-uptodate'.changes


apt                          : @{new=2.0.10; old=2.0.6}
ufw                          : @{new=0.36-6ubuntu1.1; old=0.36-6}
bolt                         : @{new=0.9.1-2~ubuntu20.04.2; old=0.8-4ubuntu1}
<list truncated for readability>

For comparison to the initial report of which updates were needed, I also wanted to summarize what was updated to see if my counts were similar. The following output is from an initial test run on a subset of minions:

$json | Select-Object Minion_id, has_errors, @{N='NumPkgs';E={ ($_.full_ret.return.'pkg_|-update_pkg_|-update_pkg_|-uptodate'.changes| Get-Member -Type NoteProperty | Measure-object).Count }} | Sort-Object minion_id

minion_id                              has_errors NumPkgs
---------                              ---------- -------
dr-extdns-21.lab.enterpriseadmins.org       False       8
h135-linux-01.lab.enterpriseadmins.org      False       7
h135-linux-02.lab.enterpriseadmins.org      False     105
net-wanrtr-01.lab.enterpriseadmins.org      False      46

Comparing the above output to the initial report we can see that all available updates were applied.
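If both JSON exports are loaded into variables, the two reports can even be lined up programmatically. A rough sketch, assuming $needed and $applied hold the output of the two Select-Object one-liners shown above:

# Hypothetical side-by-side of updates needed (first report) vs packages updated (second report)
foreach ($row in $needed) {
  $match = $applied | Where-Object { $_.minion_id -eq $row.minion_id }
  [pscustomobject]@{ minion_id = $row.minion_id; Needed = $row.NumUpdates; Applied = $match.NumPkgs }
}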

I’ve since created a schedule to run each of these jobs on a regular basis. For now, the inventory job will run daily so that I can see/track progress, and the update job will run every weekend. Hopefully the next time I ssh into a Linux VM I won’t be presented with a laundry list of required updates.


Easy wildcard certificate for home lab

In my home lab I have a container running Nginx Proxy Manager (discussed in this previous post). This proxy allows for friendlier host names and SSL for various services in the lab. Using a wildcard DNS record and wildcard SSL certificates makes for a super easy way to onboard new services.

To get started, I first needed to pick a parent domain name to use for services. I already have a DNS zone for example.com, so I decided to put these services under apps.example.com. To make this super easy to manage, I created a new domain under example.com with the name apps. It has a single CNAME record of asterisk (*) whose target FQDN is the host name of my container host. Screenshot of the DNS record from my Windows DNS server below:
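The same wildcard record could also be created with the DnsServer PowerShell module on the DNS server; a quick sketch, where the zone and container host names are placeholders:

# Create a '*' CNAME under apps.example.com that points at the container host
Add-DnsServerResourceRecordCName -ZoneName 'example.com' -Name '*.apps' -HostNameAlias 'containerhost.example.com'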

I then created a wildcard certificate for *.apps.example.com from my internal CA. There are many ways to create a certificate signing request (CSR), but since I have a lot of Aria Suite products in the lab, I like using the Aria Suite Lifecycle > Locker > Certificates > Generate CSR button. This gives me a UI to populate the fields and kicks out a single file with both the CSR & private key. I use the CSR to generate a web server certificate from my internal CA, then download the base64 certificate. I edit the resulting .cer file and append my CA’s public certificate to create a proper chain. Now that I have a certificate and private key, I can move into the Nginx Proxy Manager UI.
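That append step is just a concatenation of the two base64 files; for example, from PowerShell (file names below are placeholders):

# Append the issuing CA's public certificate to the server certificate to form the chain
Get-Content .\wildcard-apps.cer, .\internal-ca.cer | Set-Content .\wildcard-apps-chain.cer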

From the Nginx Proxy Manager UI, I select the SSL Certificates tab. I add a new SSL certificate and populate the required fields, screenshot below:

When I go to the Hosts > Proxy Hosts tab, I can now very easily add hosts with SSL capabilities. I no longer need to make a certificate for each service or even manually create DNS records. For example, let’s say my internal IPAM solution needs a certificate. Instead of creating a ‘friendly’ DNS record and dedicated certificate, I can use this Nginx Proxy and wildcard certificate. We simply add a new proxy host, enter a domain name such as ipam.apps.example.com, enter the correct host/port details, and select the correct certificate.

On the SSL tab of the new host we can pick our wildcard.apps.example.com certificate and select force SSL.

Now when I browse to http://ipam.apps.example.com/, I’m automatically redirected to the secure version of the site:

This does inject a new dependency — the Nginx Proxy Manager container needs to be running for me to reach these secure services — but in this case the container is running on a host that is typically online/working.


Synology iSCSI LUN Thin Provisioning space allocation

I recently did some performance comparisons between NFS and iSCSI on a Synology DS723+ NAS. When creating the iSCSI LUNs from within the Synology, there is a ‘Space Allocation’ drop down that has two values: Thick Provisioning (better performance) and Thin Provisioning (flexible storage allocation). When selecting Thin Provisioning, a checkbox appears to enable space reclamation. I’ve done some basic testing at a small scale and observed that both thick & thin devices perform similarly. However, I wanted to dig a bit deeper into the space reclamation functionality.

I suspected this reclamation functionality would map to the VAAI block storage UNMAP primitive, described in detail here: https://core.vmware.com/resource/vmware-vsphere-apis-array-integration-vaai#sec9410-sub6. This core.vmware.com article includes sample commands that you can use from the ESXi host perspective to verify thin provisioning status (esxcli storage core device list -d naa.624a9370d4d78052ea564a7e00011030) and if delete is supported (esxcli storage core device vaai status get -d  naa.624a9370d4d78052ea564a7e00011030). Running these commands worked as expected. All thin provisioned LUNs would show Thin Provisioning Status: yes using the first command, and only if the ‘space reclamation’ box was selected would the second command return Delete Status: supported.
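The same checks can also be run remotely through PowerCLI with Get-EsxCli -V2; a sketch, where the host name is a placeholder and the V2 argument name is assumed to mirror the esxcli --device flag:

# Remote equivalent of the two esxcli checks above
$esxcli = Get-EsxCli -VMHost 'esxi-host' -V2
$esxcli.storage.core.device.list.Invoke(@{device='naa.624a9370d4d78052ea564a7e00011030'})
$esxcli.storage.core.device.vaai.status.get.Invoke(@{device='naa.624a9370d4d78052ea564a7e00011030'})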

As expected, deleting a VM from an iSCSI datastore would be immediately reflected in the usage of the datastore from the ESXi perspective, but not from the NAS perspective. Waiting for automatic VMFS UNMAP would eventually cause the NAS perspective to catch up. To see immediate results, I was able to run esxcli storage vmfs unmap -l nas03-iscsi-lun1 from the ESXi Host. As you can see in the image below, after deleting a VM from the datastore, vSphere was showing 58.2GB used, while the same LUN on the NAS was showing 77GB used.

Running the esxcli storage vmfs unmap -l nas03-iscsi-lun1 command immediately caused the Synology side of the view to update to very similar numbers:

This automatic UNMAP is not new, but was interesting to see in practice. For the last several years I’ve been using NFS datastores in my lab, where deleting a file from the datastore deletes the same file from the NAS filesystem directly, so I haven’t looked closely at the functionality. A more thorough article describing the capabilities and settings can be found here: https://www.codyhosterman.com/2017/08/monitoring-automatic-vmfs-6-unmap-in-esxi/. The article is several years old, but I tested many of the commands using an 8.0u1 ESXi host and had very similar output.


Storage Device related Network Switch Issues?

I recently resolved a rather odd issue in my home lab.  I don’t fully understand how the symptoms and solution line up, but am writing this article for my future self… if the issue returns I’ll at least have notes on the previous steps. If you end up finding this article from a search, please feel free to leave a comment, I’d be curious to know if you were able to resolve a similar issue.

A couple of months ago, I wanted to clean up some unused connections to a physical switch (tp-link JetStream T1600G-28TS 24-Port Gigabit Smart Switch with 4 SFP Slots).  This is an access switch for some gigabit interfaces, has several VLANs configured on it, and does some light static routing of locally attached interfaces.  As part of this cleanup, I wanted to reconfigure a few switch ports so that devices with dual connections plugged in next to each other and the unused ports on the switch were contiguous. Not super important but figured it would be an easy cleanup task & free ports would be easier to see in the future.  For some reason however, I couldn’t reach the web management interface of the switch.  I tried to ping the management interface, but that failed as well.  I then tried to ping some of the default gateways for the locally attached interfaces this device was using for routing, but those failed to respond as well.  This was odd as the switch was successfully routing & connected devices were communicating fine.

As an initial troubleshooting step, I decided to simply power cycle the switch, assuming that some management functionality was degraded. The switch came back online and devices attached to it were working as expected; however, I still wasn’t able to manage/ping the switch. I assumed that the switch may have failed but wanted to troubleshoot a bit more. I thought perhaps one of the devices physically connected to the switch could have been the cause, so one at a time I physically disconnected active adapters, waited a few seconds, and when the switch still wasn’t accessible, reconnected the cable. This made me think that the issue wasn’t a specific NIC, but maybe an attached device/host (which may have had a pair/team of network adapters on the switch) was causing the problem. I set out to power down all attached equipment, disconnect all devices, power cycle the switch, and confirm that the switch could be managed/pinged with only one management uplink attached.

While doing the power down process I had a continuous ping of the physical switch management address running in the background. As soon as a specific ESXi host was shut down, the switch started responding to ping. I made a note of this host (a Dell Precision 7920 named core-esxi-34), but continued powering down everything in the lab, so that I could power it back on one item at a time to see when/if the problem returned. I fully expected turning on core-esxi-34 was going to cause the problem to return… but it did not. I powered on some VMs in the lab & did other testing, but the problem seemed to have cleared up. Thinking this was a one-time issue, solved by a reboot, I went ahead and reconfigured switch ports, moved devices, and completed the physical switch cleanup exercise. Not as easy to complete as I had originally guessed, but complete nonetheless.

I wanted to keep an eye on the situation, so the next day I tried to ping the switch again, but the symptoms had returned — I was no longer able to ping the physical switch management interface.  I decided to start troubleshooting from the ESXi host that was noted during the earlier cleanup.  I looked through a few logs and found a lot of entries in /var/log/vmkernel.log.  These entries appeared to be SCSI/disk related, and not the network issue at hand, but I made note of them anyway as they were occurring about 10x per second and it was an issue that needed to be investigated.

2023-08-22T14:21:07.637Z cpu8:2097776)nvme_ScsiCommandInternal Failed Dsm Request

After seeing them spew by I caught one entry that looked a bit different:

2023-08-22T14:21:07.639Z cpu12:2097198)HPP: HppThrottleLogForDevice:1078: Error status H:0xc D:0x0 P:0x0 . from device t10.NVMe____Samsung_SSD_970_PRO_512GB_______________S469NF0K800877R_____00000001 repeated 10240 times, hppAction = 1

But the 'Failed Dsm Request' entries surrounded this specific line. Since the logs were pointing at a specific storage device, I decided to check whether any powered-on VMs were stored on that device, and found just one. I powered down the VM in question, but the log spew didn’t immediately stop. A few minutes later, however, the continuous ping of the switch started replying. I checked the vmkernel.log again and noticed the Failed Dsm Request spew had stopped, followed by lines specifically related to that storage device:

2023-08-22T14:35:54.987Z cpu1:2097320)Vol3: 2128: Couldn't read volume header from 63d77d1e-8b63fe7c-04f6-6c2b59f038fc: Timeout
2023-08-22T14:35:54.987Z cpu1:2097320)WARNING: Vol3: 4371: Error closing the volume: . Eviction fails: No connection
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)World: 12077: VC opID sps-Main-42624-252-714962-a0-e0-14c9 maps to vmkernel opID 55002bc9
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)HBX: 6554: 'local-34_sam512gb-nvme': HB at offset 3211264 - Marking HB:
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)  [HB state abcdef02 offset 3211264 gen 28113 stampUS 96099954819 uuid 64e3507b-6256ea04-091a-6c2b59f038fc jrnl <FB 15> drv 24.82 lockImpl 3 ip 192.168.127.34]
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)HBX: 6558: HB at 3211264 on vol 'local-34_sam512gb-nvme' replayHostHB: 0 replayHostHBgen: 0 replayHostUUID:  (00000000-00000000-0000-000000000000).
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)HBX: 6673: 'local-34_sam512gb-nvme': HB at offset 3211264 - Marked HB:
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)  [HB state abcdef04 offset 3211264 gen 28113 stampUS 96129744019 uuid 64e3507b-6256ea04-091a-6c2b59f038fc jrnl <FB 15> drv 24.82 lockImpl 3 ip 192.168.127.34]
2023-08-22T14:36:03.044Z cpu13:2100535 opID=55002bc9)FS3J: 4387: Replaying journal at <type 6 addr 15>, gen 28113
2023-08-22T14:36:03.046Z cpu13:2100535 opID=55002bc9)HBX: 4726: 1 stale HB slot(s) owned by me have been garbage collected on vol 'local-34_sam512gb-nvme'

Unsure how the two issues could be related (a storage error on an ESXi host causing an issue with a physical network switch), I decided to try to move everything off this 512GB NVMe device, delete the VMFS, and not use it for some time to see if the issue returned. The next time I checked, the issue had returned. Upon further investigation of the vmkernel.log I found a similar log entry, surrounded by the familiar Failed Dsm Request spew:

2023-08-25T12:00:24.916Z cpu1:2097186)HPP: HppThrottleLogForDevice:1078: Error status H:0xc D:0x0 P:0x0 . from device t10.NVMe____Samsung_SSD_970_PRO_1TB_________________S462NF0M611584W_____00000001 repeated 5120 times, hppAction = 1

This is a different 1TB NVMe device in the host, which had a VMFS volume with running VMs on it. I wanted to see if this NVMe device had any firmware updates available, but I wasn’t able to find firmware for a 970 Pro device, only the 970 EVO and 970 EVO Plus. For grins, I did boot the host to a USB device with these firmware updates, and it confirmed that the updater did not detect any devices to update.

As I was looking at these two NVMe Disks, I did notice one thing that I thought was sort of strange — I had expected each device to have its own vmhba storage adapter, which is somewhat typical for NVMe. However, in my case, both of these devices were attached to vmhba2 per the following screenshot.
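A quick way to check this device-to-adapter mapping from PowerCLI is shown below (a sketch; the RuntimeName property includes the vmhba the device is attached to):

# List local disks with their canonical name and vmhba runtime name
Get-VMHost 'core-esxi-34' | Get-ScsiLun -LunType disk | Select-Object CanonicalName, RuntimeName, Model, IsSsd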

I did a bit of searching on the Intel Volume Management Device and realized that it could provide RAID1 for NVMe. With this particular setup however, these local NVMe devices have short term test VMs on them that I want to perform well, but don’t really care if the data is lost. As an aside, I checked after the fact and confirmed the host had a software package for Intel NVME Driver with VMD Technology (version 2.7.0.1157-2vmw.703.0.20.19193900) installed. I only mention it as it could be relevant and wanted to keep it for my notes as this issue may have been fixable using the VMD controller. I found an article (Precision 5820 Tower, Precision 7820 Tower, Precision 7920 Tower NVMe Drives Do Not Work in Legacy Mode | Dell US) which had some steps for disabling VMD via BIOS > Settings > System Configuration > Intel VMD Technology and deselecting / disabling all the options on that screen. I took this step, but when the host came back online, the local-34_sam1tb-nvme VMFS was inaccessible & it had a couple of VMs on it that I was in the middle of using. I backed out the change and when the host came back online the VMFS was accessible. I migrated the VMs on this datastore to other temporary storage and then reimplemented the changes. After disabling the Intel VMD, the system shows fewer vmhba devices, and each NVMe device is on its own SSD controller. The device type also changed from SCSI to PCIE as pictured below.

I created new VMFS volumes, one on each NVMe device, and moved some VMs back to them to generate IO. I’ve subsequently created new VMs, generated a lot of IO, deleted temporary VMs and generally put the devices through their paces. I have not seen the issue with the management address of the physical switch since. The problem used to return in a few hours/days, but this updated configuration has been running for over a week and has been stable.

Summary: I suspect that disabling Intel Volume Management Device (VMD) for these NVMe devices resolved this issue. However, it is possible there was some sort of VMFS corruption on the devices and deleting/recreating the filesystem was the fix. Additionally, I did not investigate the possibility of a newer/OEM driver for the Intel VMD controller. I wanted to capture these notes in this post; if the issue returns, I’ll have an idea of where to start troubleshooting next time.
