Video Card Power Consumption & Savings

The primary system in my homelab is a Dell Precision 7920 Tower. I recently had the case open and noticed the video card: a physically large Nvidia GeForce RTX 2080 Ti, which is overkill for my purposes. The system typically runs virtual machine workloads that do little to nothing video related. The GPU also had supplemental power cables running to it, which made me wonder how much extra power it consumed, even when idle.

I have a Kill A Watt monitor which can measure power usage, so I connected it up and powered on the system. It booted to ESXi and sitting in maintenance mode the system used between 160 and 180 watts of power, typically running at the lower end of that range.

I then removed the GPU and replaced it with a lower end MSI GeForce 210 (https://www.amazon.com/dp/B003XM568I) for $40 USD. This card gets all its power from the PCIe slot, no extra power input required. Checking this configuration with the same Kill A Watt, from the same maintenance mode state previously tested, the system used between 100 and 120 watts.

Using an electricity calculator, a savings of 60 watts running 24/7, at around $0.15 USD/kWh, works out to about $79 USD per year. The ROI for this replacement video card is great: it pays for itself in about 6 months. I had incorrectly assumed that the GPU wouldn’t consume much power at idle, but this test showed that significant energy savings could be realized with a minor change that doesn’t impact my use case.
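
For anyone who would rather check the math than trust a calculator, here is a minimal shell sketch of the same arithmetic. The variable names are my own, and the inputs are just the figures from this post (60 watts saved, $0.15 USD/kWh, a $40 USD card), so adjust them for your own rates.

```shell
#!/bin/sh
# Inputs taken from the measurements above; change to match your setup.
watts_saved=60     # difference between the two idle readings
rate_per_kwh=0.15  # electricity price in USD per kWh
card_cost=40       # price of the replacement card in USD

# kWh per year = watts * 24 hours * 365 days / 1000
annual_savings=$(awk -v w="$watts_saved" -v r="$rate_per_kwh" \
  'BEGIN { printf "%.2f", w * 24 * 365 / 1000 * r }')

# Months until the card has paid for itself
payback_months=$(awk -v c="$card_cost" -v a="$annual_savings" \
  'BEGIN { printf "%.1f", c / a * 12 }')

echo "Annual savings: \$${annual_savings} USD"
echo "Payback period: ${payback_months} months"
```

With the numbers above this prints an annual savings of $78.84 USD and a payback period of 6.1 months.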

Posted in Lab Infrastructure

Upgrade Issues using Aria Suite Lifecycle 8.16

I recently upgraded to Aria Suite Lifecycle 8.16.0 in a lab and ran into issues upgrading various products it managed.

Aria Operations for Logs (formerly vRealize Log Insight) was being upgraded from 8.14.0 to 8.16.0, but failed with error LCMVRLICONFIG40004.

Aria Automation Config (formerly vRealize Automation SaltStack Config) was being upgraded from 8.14.1 to 8.16.1, but failed with error LCMUPGRADEVSSC10102.

Aria Automation (formerly vRealize Automation) was being upgraded from 8.14.0 to 8.16.1, but failed with error LCMVRACONFIG50008.

Aria Operations for Networks (formerly vRealize Network Insight) was being upgraded from 6.11.0 to 6.12.1, but failed with errors LCMVRNICONFIG90115 (platform) and LCMVRNICONFIG90114 (collector).

In all four cases, the same KB article saved the day: https://kb.vmware.com/s/article/95835. Removing older ciphers for the SSH service allowed Aria Suite Lifecycle to connect/start/validate upgrade progress.

It took a bit of troubleshooting to figure this out, but ultimately it was the detailed error from the Ops for Logs failure that helped the most: com.vmware.vrealize.lcm.vrli.exception.VrliInvalidHostException: Cannot execute ssh commands. Exception encountered : Session.connect: java.security.spec.InvalidKeySpecException: key spec not recognized. This led me to look for KB articles related to SSH for Aria products. I found KB 95835, tested it quickly and was then on my way. When I ran into the subsequent errors with the other products, I tried this same KB article with similar success.

Posted in Lab Infrastructure

TinyCore 15 Virtual Machine – very small VM for testing

For several years I’ve been using a couple of very small TinyCore Linux virtual machines for testing in my lab. These run very well in nested infrastructure and have a package to support open-vm-tools so you can interact with them like normal virtual machines (for example, cmdlets like Shutdown-VMGuest will interact with them). I was recently updating templates to TinyCore 14.0 when I realized that version 15.0 had just been released (https://forum.tinycorelinux.net/index.php/topic,26861.0.html). I wanted to share the steps to create these new templates in this post.

I have a pair of these virtual machines; one has a GUI and the other is command line only. I primarily use the CLI version because it uses even fewer resources, but keep the GUI one around in case I need a test web browser available.

The Virtual Machine

When creating the virtual machine, I used the following options:

  • Compatible with: ESXi 6.7 U2 and later (vmx-15)
  • Operating System: Linux / Other 4.x or later Linux (32-bit)
  • 1 vCPU
  • 1 GB RAM
  • 1 GB disk (thin provisioned)
  • Expand Video card > Total video memory = 8MB (when using GUI, for CLI only I left it at the default 4MB)
  • VM Options tab > Boot Options > Firmware = BIOS

The Install

  • Power on VM
  • Open Remote Console (the one that launches VMRC or VMware Workstation, not the web console)
  • Attach to a local CorePlus ISO image
  • CTRL+ALT+INS to reboot
  • Select Boot Core with X/GUI (TinyCore) + Installation Extension
  • Click the installation button on the task bar. Select Frugal > Whole Disk > sda > install boot loader > ext4
  • Select either ‘core and x/gui desktop’ or ‘core only (text based interface)’ depending on which is appropriate.
  • Proceed
  • When the display says installation successful, Exit > Shutdown.
  • Power On VM (this will ensure that the CD is no longer connected and boot into the install)
  • At this point, we have a usable VM, but no VMware Tools.

Customization

I’ve done these customization steps a number of times, but this time I spent a few extra minutes to partially automate the process.

If booting into a GUI, I use Control Panel > tcWbarConf to set the position of the icon bar to Left Vertical or Top Left. Having it at the bottom is sort of awkward, as sometimes when the screen resizes it is left in the middle of the screen. Once positioned, I exit and select Exit to Prompt.

Since the VM doesn’t have/need SSH for my purposes, and copy/paste isn’t available, I placed the script and some associated files on a webserver. Once the system was online, I used wget to download the script, renamed the file, made it executable, and then ran the script. For a second version, I adjusted the script to expect a string coming in as a parameter, then used that value to set the hostname. I ended up with three files hosted on an internal web server:

rootca-example-com.crt is the root certificate from my internal CA.

policies.json contains the entries needed for Firefox to disable automatic updates and to trust the above root certificate. Its contents are below:

{
  "policies": {
    "Certificates": {
      "ImportEnterpriseRoots": true,
      "Install": ["/usr/local/share/ca-certificates/rootca-example-com.crt"]
    },
    "DisableAppUpdate": true
  }
}

buildscript2.txt is the customization script. It’s stored as text as my webserver blocks .sh files by default and getting the text and renaming the file client side was an easy workaround. This script is actually responsible for downloading the two files above and putting them in the right filesystem locations.

if [ -z "$1" ]; then
  echo "Please provide an argument which is used for hostname and other logic."
  exit 1
fi

# The following code will run for either case, gui or cli
sudo sed -i "s/sethostname box/sethostname $1/g" /opt/bootsync.sh

tce-load -wi ca-certificates curl pcre

sudo mkdir -p /usr/local/share/ca-certificates
sudo wget http://www.example.com/build/rootca-example-com.crt -P /usr/local/share/ca-certificates
sudo update-ca-certificates
echo "usr/local/share/ca-certificates" >> /opt/.filetool.lst

echo "/usr/local/sbin/update-ca-certificates" | sudo tee -a /opt/bootlocal.sh 
echo "/usr/local/etc/init.d/open-vm-tools restart" | sudo tee -a /opt/bootlocal.sh 


if [[ $1 == *"gui"* ]]; then
  # install firefox and open-vm-tools-desktop packages
  tce-load -wi firefox_getLatest open-vm-tools-desktop

  # deploy firefox policy to disable autoupdate and trust certs
  sudo mkdir -p /etc/firefox/policies
  sudo wget http://www.example.com/build/policies.json -P /etc/firefox/policies
  echo "etc/firefox" >> /opt/.filetool.lst

  # install firefox latest
  firefox_getLatest.sh

  # instead of loading the firefox_getLatest script, load actual firefox
  sudo sed -i 's/firefox_getLatest/firefox/g' /etc/sysconfig/tcedir/onboot.lst

else
  # install open-vm-tools package
  tce-load -wi open-vm-tools
fi

# the following runs after all else to back up the config.
echo y | backup

The script above looks for gui in the hostname argument and, if present, installs Firefox and open-vm-tools-desktop; otherwise it installs only open-vm-tools. I downloaded and ran it like this, replacing HOSTNAME with the name to assign to the VM:

wget http://www.example.com/build/buildscript2.txt
mv buildscript2.txt buildscript2.sh
chmod +x buildscript2.sh
./buildscript2.sh HOSTNAME
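
The hostname check in the build script uses the [[ ... ]] syntax, which worked for me on TinyCore but is not part of POSIX sh. As an alternative (not what the script above does), the same test can be sketched with a case pattern. The function name and the two hostnames below are made up for illustration.

```shell
#!/bin/sh
# Returns 0 (true) when the supplied hostname contains "gui".
# Uses a POSIX case pattern instead of the [[ ... ]] test.
is_gui_host() {
  case "$1" in
    *gui*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Hypothetical hostnames, only to show both branches.
for name in tc15-gui-01 tc15-cli-01; do
  if is_gui_host "$name"; then
    echo "$name: firefox + open-vm-tools-desktop"
  else
    echo "$name: open-vm-tools"
  fi
done
```

The first name takes the GUI branch and the second takes the CLI branch, mirroring the if/else in the build script.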

After running the script, the VM is ready to use. I typically shut down the VM and export it as an OVF or OVA that can be placed on an internal web server and deployed as needed. This creates a super tiny appliance: only 26 MB for the CLI version and 229 MB for the GUI version. Not bad for a fully functional OS, with GUI, a web browser, and trusting my internal CA out of the box.

Note: the above script/process has only been tested with TinyCore 14.0 and 15.0 releases.

Posted in Lab Infrastructure, Scripting, Virtualization

Cannot configure identity source due to Type or value exists.

On vCenter Server 7.0u3p (aka 7.0.3.01800), I recently experienced an error “Cannot configure identity source due to Type or value exists.” when configuring Active Directory over LDAPS. The issue was caused by a duplicate certificate, but that fact was not immediately obvious.

To configure AD over LDAPS we must provide the certificate used by the domain controller. KB article https://kb.vmware.com/s/article/2041378 shows how to use openssl s_client to obtain this certificate on port 636 (LDAPS). We obtained the certificate from each domain controller and provided both in the “Edit Identity Source” screen, but saving that configuration resulted in the “Type or value exists” error.

Tailing the /storage/log/vmware/vmdird/vmdird-syslog.log file, we noticed an entry when saving the above configuration similar to:

2024-01-23T13:39:26.847703+00:00 err vmdird  t@140567635818240: InternalAddEntry: VdirExecutePostAddCommitPlugins - code(9619)
2024-01-23T13:39:26.848501+00:00 err vmdird  t@140567635818240: VmDirSendLdapResult: Request (Add), Error (LDAP_TYPE_OR_VALUE_EXISTS(20)), Message (Invalid or duplicate (userCertificate)), (0) socket (127.0.0.1)

The Invalid or duplicate (userCertificate) part of this error was interesting. After checking with the directory services folks, they confirmed they had placed the same certificate on multiple domain controllers, listing each domain controller name/IP in the subject alternative name (subjectAltName) field. When using openssl s_client to obtain the certificates, each DC returned the exact same value, which would explain a duplicate.
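
Since each DC returned the same certificate, a quick comparison before uploading would have caught the duplicate. The sketch below is one way to do that with standard tools. The function names, file names and host name are my own placeholders, and cksum is a simple CRC rather than a cryptographic hash, so treat a match as a strong hint rather than proof.

```shell
#!/bin/sh
# Print a checksum of the first PEM certificate found in a file. Text outside
# the BEGIN/END markers (such as openssl s_client chatter) and differences in
# whitespace or line endings are ignored.
cert_digest() {
  sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$1" |
    sed '/-----END CERTIFICATE-----/q' |
    tr -d ' \r\n' | cksum
}

# Succeeds (exit 0) when both files contain the same certificate.
same_cert() {
  [ "$(cert_digest "$1")" = "$(cert_digest "$2")" ]
}

# Example usage (hypothetical host and file names):
#   openssl s_client -connect dc01.example.com:636 </dev/null > dc01.txt
#   openssl s_client -connect dc02.example.com:636 </dev/null > dc02.txt
#   same_cert dc01.txt dc02.txt && echo "duplicate: upload only one file"
```

If same_cert succeeds for a pair of domain controllers, only one of the two files needs to be provided to the identity source.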

To work around this issue, we left both servers listed in the “Edit Identity Source” screen, but only provided a single certificate file. This change saved successfully and didn’t result in the ‘Type or value exists’ error message.

Posted in Lab Infrastructure, Virtualization

vSphere ESXi Host Certificate Status Alarm bulk resolution

In the vSphere UI, some hosts will occasionally trigger an alarm of “ESXi Host Certificate Status”. VMware Skyline has a finding for this issue as well — vSphere-HostCertStatusAlarm. The resolution is typically straightforward, right click the host > certificates > renew certificate. However, if you have hundreds of hosts where this needs to happen it can be tedious to use the UI. This post will explore why the certificates expire & how to automate their replacement when needed.

By default, these certificates are issued by the VMware Certificate Authority (VMCA). A certificate is issued to the host when the host is added to vCenter Server. The validity period is configured using a vCenter advanced setting: from Inventory > vCenter > Configure > Settings > Advanced Settings, the value of vpxd.certmgmt.certs.daysValid is the validity period, in days, requested for a new or renewed certificate. The default should be 1825, which is 5 years.

In this view you can also see the vpxd.certmgmt.certs.minutesBefore value. This sets how far before the time of the request, in minutes, the certificate’s start date (NotBefore) is placed. The default value of 1440 (24 hours) ensures that anything validating the certificate doesn’t consider it not yet valid because it was issued too recently. I mention this as it’ll be relevant in some of the examples below.

From Administration > Certificates > Certificate Management, the ‘VMware Certificate Authority’ tile shows the validity of the VMCA certificate.  This will be the max age that a new certificate can be valid.  For example, in my lab this value is Nov 9th, 2028.  This is a touch under 5 years, so even though I would request a 5 year validity period (vpxd.certmgmt.certs.daysValid setting above), this VMCA cert will expire prior to that and can only issue certs through this slightly shorter date.  This VMCA certificate is typically valid 10 years from when the VC is first deployed or the cert is replaced.  This cert can be regenerated with certificate-manager (https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.authentication.doc/GUID-E1D35792-ED03-468A-966B-362BED18021A.html), but doing so will restart all vCenter services and is more disruptive than the ESXi host certificate renewal process.

To figure out how to automate this functionality, I enabled code capture (Developer Center > Code Capture) and recorded the action of renewing the certificate from the UI. This showed me what the UI was doing to make the request and gave me a lead on “CertificateManager” being something I should search for.

Armed with that info, I was able to put together a short PowerCLI script to query the cert values for all hosts and store the output in a variable so we can use it later. In the output I included the moref of the host, because that is needed to call the CertMgrRefreshCertificates_Task method that we identified using Code Capture. CertificateInfo has additional properties such as Issuer (the vCenter) and Subject, but those aren’t required for our exercise so I didn’t include them.

# create a collection to store ESXi Host & certificate details
$myHostCertStatus = @()
foreach ($thisESX in Get-VMHost -State:Connected | Sort-Object Name) {
  $certMgr = Get-View -Id $thisESX.ExtensionData.ConfigManager.CertificateManager
  $myHostCertStatus += $certMgr.CertificateInfo | Select-Object @{N='VMHost';E={$thisESX.name}}, @{N='moref';E={$thisESX.extensiondata.moref}}, NotBefore, NotAfter, Status
}

With our new variable populated, we can look at the specific host from the UI action to cross reference our timestamps. Here is the event from the UI:

Here is the output from our variable, filtered down to only the host we are interested in.

$myHostCertStatus | Where-Object {$_.VMhost -match 'euc-esx-21'}

VMHost    : euc-esx-21.lab.enterpriseadmins.org
moref     : HostSystem-host-25598
NotBefore : 2/12/2024 8:31:07 PM
NotAfter  : 11/9/2028 6:51:22 PM
Status    : good

We can see that the NotBefore value is Monday the 12th at 8:31 PM. EST is 5 hours behind the displayed time, so this is 3:31 PM EST on the 12th, exactly 24 hours before the 3:31 PM time from the screenshot above, which matches the minutesBefore setting. The NotAfter value is shorter than the 5 years requested because it is capped by the VMCA expiry date of Nov 9th, 2028.
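
The arithmetic behind those two dates can be sketched in a few lines of shell. This is only an illustration: the request time of 2024-02-13 20:31:07 is inferred by adding 24 hours to the NotBefore value above, all timestamps are assumed to be UTC, and the capping of NotAfter at the VMCA expiry is modeled on the behavior described in this post rather than taken from VMCA documentation. It needs a date command that accepts -d (GNU or BusyBox).

```shell
#!/bin/sh
# vCenter advanced settings (defaults mentioned above)
days_valid=1825        # vpxd.certmgmt.certs.daysValid
minutes_before=1440    # vpxd.certmgmt.certs.minutesBefore

# Assumed inputs, treated as UTC
request_utc='2024-02-13 20:31:07'      # inferred time of the renewal request
vmca_expiry_utc='2028-11-09 18:51:22'  # VMCA certificate expiry in my lab

to_epoch()   { date -u -d "$1" +%s; }
from_epoch() { date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'; }

req=$(to_epoch "$request_utc")
vmca=$(to_epoch "$vmca_expiry_utc")

# NotBefore is backdated by minutesBefore; NotAfter is requested as
# daysValid from the request but cannot outlive the issuing VMCA cert.
not_before=$((req - minutes_before * 60))
not_after=$((req + days_valid * 86400))
if [ "$not_after" -gt "$vmca" ]; then
  not_after=$vmca
fi

echo "NotBefore: $(from_epoch "$not_before")"
echo "NotAfter : $(from_epoch "$not_after")"
```

With these inputs the script prints a NotBefore of 2024-02-12 20:31:07 and a NotAfter of 2028-11-09 18:51:22, which lines up with the values PowerCLI returned for the host.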

We now have all the info we need: a way to validate current certificate status, the method we need to call to renew a certificate, and a host we are willing to test with. In the example below I’m filtering our variable down to only one test host and passing that item to the method identified from code capture. A task ID is returned by PowerCLI.

# the vCenter-level certificate manager only needs to be retrieved once
$thisCertMgr = Get-View -Id 'CertificateManager-certificateManager'
$myHostCertStatus | Where-Object {$_.VMHost -match 'euc-esx-21'} | ForEach-Object {
  $thisCertMgr.CertMgrRefreshCertificates_Task($_.moref)
}

Type Value
---- -----
Task task-3269126

If we look in the UI, we can confirm this task was executed and get the timestamp of the request.

Re-running the host query block from above to check this host’s output, we can see that the NotAfter value has not changed (it is still constrained by the VMCA validity) but the NotBefore value has been updated.

VMHost    : euc-esx-21.lab.enterpriseadmins.org
moref     : HostSystem-host-25598
NotBefore : 2/13/2024 1:34:23 PM
NotAfter  : 11/9/2028 6:51:22 PM
Status    : good

Now that we’ve confirmed the certificate has been replaced, and the expiration date aligns with our expectations, we can tweak this to look at the NotAfter or Status properties and run the same code on a larger block of hosts.

Posted in Lab Infrastructure, Scripting, Virtualization