How I Doubled My Homelab CPU Capacity for $200: Xeon Gold 6230 Upgrade

In this post, I’ll walk you through how I solved a growing CPU bottleneck issue in my homelab by upgrading the CPUs in my Dell Precision 7920. I’ll share the process, challenges, and cost-effective solution that allowed me to double my system’s CPU capacity.

The primary system in my homelab is a Dell Precision 7920 tower. I purchased it on eBay with 2x Xeon Gold 5222 CPUs and 512GB of RAM about 2 years ago, replacing a pair of older HP DL360 Gen8 rack-mount systems. The older HP systems each had a pair of E5-2450L CPUs, 8 cores each at 1.8GHz, for a total of 28.8GHz per system… but those systems were primarily constrained by RAM, not CPU. Based on some rough math, I made the decision to go from a total of 32 cores at 1.8GHz to just 8 cores at 3.8GHz.

In the first ~6 months, everything was great. Neither CPU nor RAM was a bottleneck, and everything was running well. However, as I added more and more nested environments (including nested VCF), I started running into CPU contention. By early 2024, I knew that this cluster's CPU usage was high. I could see from Aria Operations that CPU demand was well above the usable capacity most of the time.

CPU Demand of 30-Greenfield cluster, taken in early 2024

Around that time I looked into replacement CPUs for this system. I first attempted to drop in some Xeon Gold 6138 CPUs (1st Gen Scalable), as they were very inexpensive (around $50/pair). Unfortunately, these CPUs were not compatible with the RAM in this system. The memory configuration is 8x64GB 2933MHz DIMMs, which limited my CPU choices to only those which support 2933MHz memory (based on the table on page 91 of the owner's manual). The 2nd Gen Scalable CPUs were preferred, as they are not expected to be deprecated in the next major vSphere release (per https://knowledge.broadcom.com/external/article/318697/cpu-support-deprecation-and-discontinuat.html). I decided the best two options would be the Xeon Gold 6238 (22 cores/socket at 2.1GHz) or 6230 (20 cores/socket at 2.1GHz). At the time, these CPUs were running about $500/ea (6238) or $350/ea (6230) from various eBay sellers. I decided to hold off on the replacement and instead turn certain environments off and on as needed rather than running them all the time.

A few weeks ago, when running most of my nested environments concurrently again, I was seeing high CPU use. I did a bit more research and confirmed that the 6238 and 6230 CPUs were still solid options for what I needed, but now the price had fallen to $350/ea (6238) or $95/ea (6230). The 6238 CPUs would provide a total of 92GHz of capacity, while the 6230s would deliver 84GHz. Given that the demand for the cluster is only around 45GHz, the lower-cost 6230s, at roughly 2x the capacity I needed, looked like a solid option. I decided to pick up a pair of 6230s and get them switched out. In the chart below, you can see that a few days prior to the "Now" line, the usable capacity of this cluster more than doubled. Aria Operations now shows >1 year remaining until CPU capacity runs out.
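For the curious, the back-of-the-napkin capacity math in PowerShell (core counts, clocks, and prices as quoted above):

$options = @(
    [pscustomobject]@{ CPU = 'Xeon Gold 6238'; Sockets = 2; CoresPerSocket = 22; GHz = 2.1; PricePerPair = 700 }
    [pscustomobject]@{ CPU = 'Xeon Gold 6230'; Sockets = 2; CoresPerSocket = 20; GHz = 2.1; PricePerPair = 190 }
)
# Aggregate capacity = sockets x cores x base clock
$options | Select-Object CPU, PricePerPair, @{N='TotalGHz';E={ $_.Sockets * $_.CoresPerSocket * $_.GHz }}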

CPU Demand of 30-Greenfield cluster, taken in Jan 2025 after replacing CPUs

Conclusion

I knew that CPU usage was high, and that the most obvious solution was to add capacity. Memory constraints narrowed the options to just two CPUs, and having specific capacity history made it easy to pick the more cost-effective one. Instead of spending $700 on a pair of 6238 CPUs, I was able to solve the issue with just $200 for a pair of 6230 CPUs. After making the change, reviewing the same chart confirmed that the issue is in fact solved.

Posted in Lab Infrastructure, Virtualization

Getting Started with Bruno: A Beginner’s Guide to Simplifying API Interactions

I've been working with APIs more than ever lately, and a colleague recently introduced me to Bruno, an offline, open-source API client. For the most part, I had been interacting with APIs using in-product Swagger UIs or with PowerShell's Invoke-RestMethod. That can be challenging at times: remembering complex URLs, managing headers, handling authentication. Bruno provides a standalone GUI to help streamline these tasks. As I was getting up to speed with the interface, I revisited several prior posts to connect to APIs I was familiar with. The following notes are what I learned while getting started with Bruno.

Example 1: APC BackUPS

In a previous article we explored creating our own custom 'API' using a PowerShell HTTP listener. That was a very basic example, as it required no authentication or special headers. Since this API is so simple, it's an easy first example for Bruno.

  • Create new collection
  • Create new request
    • Name the request (getStats)
    • Specify the URL (http://servername.example.com:6545/getStats)
    • Run (the right arrow at the end of the URL).

The response should show the JSON body that we crafted in our script. Now if I need to run this again, I don't have to remember the hostname/port number for my service; I can just hit go and get the current response.
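For comparison, the equivalent ad-hoc call with Invoke-RestMethod, which is exactly what Bruno is saving us from retyping:

Invoke-RestMethod -Uri 'http://servername.example.com:6545/getStats'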

Example 1b: Using a Bruno ‘environment’

In the top right of a Bruno collection, there is a drop-down list that says 'No Environment'. If we select that drop-down and choose 'Configure', we can create a new environment. An environment is a place to store variables, like server names, ports, and credentials. For my example, I'm going to create an environment named 'server room'. In the 'server room' environment, I'll define a variable named apcBackupsHost with the value servername.example.com. With this environment variable defined, I can edit my URL to use the variable name, enclosed in a pair of curly braces as shown below:
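http://{{apcBackupsHost}}:6545/getStats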

If I had multiple hosts running this API, I could create a different environment for each. That way I can toggle between them using the environment drop-down list without needing to update any of my API calls. This environment functionality can save time when working with different environments (e.g., production vs. staging) and can help prevent errors when managing credentials or server names.

Example 2: VI/JSON

The next example comes from a prior post as well — Using VI/JSON with Powershell. VI/JSON was introduced in vSphere 8.0U1 as a way of accessing the vSphere Web Services SDK via a REST interface. To get started with this in Bruno, we’ll make a new collection with a POST request to login. We’ll also make an environment for this collection that has four variables:

  • VC = vCenter Server name or IP
  • vcVersion = the version used in our request (8.0.3.0 in my case)
  • username = the username used to connect to vCenter Server
  • password = the password used to connect to vCenter Server.

I’ve named my request ‘Login’ and set a few properties. First the URL is https://{{VC}}/sdk/vim25/{{vcVersion}}/SessionManager/SessionManager/Login, which contains two of the variables from my environment. The body of the login contains the other two variables, as pictured below:
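A minimal sketch of that JSON body, assuming the userName/password parameter names from the SDK's Login method:

{
  "userName": "{{username}}",
  "password": "{{password}}"
}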

In addition to the Body I’ve made two other tweaks to this request. You can see where tweaks have been made in the above screenshot… any tab with a change has an indicator count. I’ve outlined the specific changes below:

  • Headers:
    • Name = Content-Type
    • Value = application/json
  • Vars > Post Response:
    • Name = vmware-api-session-id
    • Expr = res.headers['vmware-api-session-id']

The post response variable says to take the vmware-api-session-id response header value and save it in a variable for future use, like in our next request.

My second request, which I named 'Get VM', is a GET of https://{{VC}}/sdk/vim25/{{vcVersion}}/VirtualMachine/vm-31/config, where vm-31 is the managed object reference ID of a specific VM. For this request, I've set two headers: Content-Type=application/json and vmware-api-session-id={{vmware-api-session-id}}, which uses the variable we captured from the login request as shown below:

With these two headers defined, we can send our request, and it will retrieve the configuration details of our specific VM.

If there is another request we need to make in this same collection, we can right click the name of our request (Get VM in this case) and clone the request. This will make a new request with the same customized values already populated. This allows us to simply change the URL and submit a different request. For example, if I want to get details about all my license keys, I can change the URL to https://{{VC}}/sdk/vim25/{{vcVersion}}/LicenseManager/LicenseManager/licenses. The headers are already populated so I can send the request (CTRL + Enter is the default key binding for this task) and we’ll have a JSON body showing all of our license keys.
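For comparison, here is roughly the same login-and-query flow in PowerShell (a sketch; the vCenter name, credentials, and the vm-31 moref are placeholders for your own values):

$vc = 'vcenter.example.com'
$vcVersion = '8.0.3.0'

# Login: the session key comes back in the vmware-api-session-id response header
$loginBody = @{ userName = 'administrator@vsphere.local'; password = 'VMware1!' } | ConvertTo-Json
$login = Invoke-WebRequest -Uri "https://$vc/sdk/vim25/$vcVersion/SessionManager/SessionManager/Login" -Method POST -ContentType 'application/json' -Body $loginBody
$sessionId = "$($login.Headers['vmware-api-session-id'])"

# Reuse the session header for subsequent requests
Invoke-RestMethod -Uri "https://$vc/sdk/vim25/$vcVersion/VirtualMachine/vm-31/config" -Headers @{ 'vmware-api-session-id' = $sessionId }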

Example 3: Aria Ops Casa API

Finally, in another previous post we looked at logging into the Aria Operations Casa API using an LDAP account. This is a bit more difficult, as we need to base64-encode a username:password string to pass as a header for authentication. Let's see if we can do the same in Bruno.

  • Create new collection Aria Ops Casa
  • Create new request Casa Login
  • Create new environment lab with three variables: vropsServer, vropsCasaLdapUser, vropsCasaLdapPass and enter appropriate values. For the password I checked the ‘secret’ checkbox.
  • For the request type we’ll select POST and for our URL we will enter https://{{vropsServer}}/casa/authorize
  • On the script tab, we’ll build a ‘pre request’ to do some of the heavy lifting for authentication. Specifically, we’ll use a built-in function to do base64 encoding of our username/password string and then set our request Authorization header using that string. Sample code below:
// Bruno ships btoa as an inbuilt library for base64 encoding
const btoa = require("btoa");
// Build the 'vrops-ldap <base64(user:pass)>' value the Casa API expects for LDAP accounts
var b64login = "vrops-ldap " + btoa(bru.getEnvVar("vropsCasaLdapUser") + ":" + bru.getEnvVar("vropsCasaLdapPass"));
// Attach the computed value as the Authorization header on the outgoing request
req.setHeader("Authorization", b64login);
  • On the Vars tab we’ll update the post response section to create a new variable named accessToken and use the expression res.body.accessToken to get the accessToken property from the body of the response.

Running the above request should get our server name, username, and password variables and use them to connect to the API. We’ll then create a new variable with the token we need for future requests.

To check the Aria Operations cluster status, we’ll start a new API request. This request must run after the above request, which populates the accessToken variable.
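The key piece of that follow-up request is the authorization header, which reuses the token captured by the post-response variable (the exact Casa path will depend on what you are checking):

  • Headers:
    • Name = Authorization
    • Value = Bearer {{accessToken}}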

We now have this collection saved so we can easily access it in the future. If we have additional Aria Operations instances, we can copy the environments (so that all the variable names come over) and then update the variable values accordingly. This gives us a quick drop down to select which Aria Operations environment to query so we don’t need to re-enter username & passwords every time.

Conclusion

Bruno makes quick work of firing off a simple API call. The collections and environments are useful, especially when we have many endpoints to query. I can see this application becoming part of my API toolkit, and you should consider it too. More information about Bruno can be found in the official docs at https://docs.usebruno.com/introduction/what-is-bruno.

Posted in Lab Infrastructure, Scripting

Are my ESXi hosts sending syslog to Aria Operations for Logs?

I was recently working on an issue where a query in Aria Operations for Logs was not returning an event that I fully expected to be present. After a bit of troubleshooting, I found that the ESXi host was sending the logs to syslog, but a firewall was preventing them from being received. Reflecting on this, I realized there are many possible failure scenarios where a host could be properly configured but something in the path could be causing problems. You can see some of the possible failure points in the image below; anywhere the log message has to traverse a firewall or forwarder is a suspect.

As we can see above, some syslog topologies can be complex, and that complexity introduces the possibility of failure. ESXi host firewalls, physical firewalls, and any log forwarding device can be a place where events are lost. I wanted to create a script to help identify some of these gaps, which we'll outline below.

Part 1 – Sending a Test Message

For this test, I wanted to use the esxcli system syslog mark command to send a message. To make this message easy to find in Aria Operations for Logs, I generated a GUID to send in the message and will be able to look for it later. Any unique string will work, but this is something easy enough to generate with each test. Also, in larger environments where good configuration management is happening, I may not need to test every host. I decided to add a bit of logic in the script to only test a percentage of available hosts.
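For reference, this is the underlying command as it would be run at an ESXi shell, with <guid> standing in for whatever value the script generates:

esxcli system syslog mark --message="<guid> - Test Message"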

$newGuid = [guid]::NewGuid().Guid
$message = @{'message' = "$newGuid - Test Message"}

$percent = Read-Host -Prompt "What percentage of hosts should we review?"

# For each randomly selected host, send a syslog message with esxcli
$sendResults = @()
$hosts = Get-VMHost -State:Connected
$hostCount = [math]::Ceiling(($hosts | Measure-Object).Count * ($percent / 100))
$hosts | Get-Random -Count $hostCount | Sort-Object Name | ForEach-Object {
  $esxcli2 = $_ | Get-EsxCli -V2

  $sendResults += $_ | Select-Object Name, @{N='SyslogServer';E={($_ | Get-AdvancedSetting -Name Syslog.global.logHost).Value}},
           @{N='SyslogMarkSent';E={$esxcli2.system.syslog.mark.Invoke($message)}}
}

The above code will create a custom object $sendResults that will contain all of the hosts where the test syslog message was sent. In the next section we’ll see which of those events made it to our Aria Operations for Logs instance.

Part 2 – Query the Aria Operations for Logs events API

To make sure our syslog ‘mark’ messages made it from ESXi to our centralized Aria Operations for Logs instance, we’ll use the API to query for logs containing the $newGuid value we sent from part 1.

The first couple of lines of this script take care of logging into the API. We then send an event query and build a hashtable of hostname and timestamp strings. This will allow us to index into our results to see when Aria Operations for Logs received each event. Finally, we'll loop through all the hosts we sent a test message to in part 1 and get the event timestamp from our hashtable.

$loginBody = @{username='admin'; password='VMware1!'; provider='Local'} | ConvertTo-Json
$loginToken = (Invoke-RestMethod -Uri 'https://syslog.example.com:9543/api/v2/sessions' -Method 'POST' -Body $loginBody).sessionId

# Query for events containing our GUID, then index the results by hostname
$myEvents = Invoke-RestMethod -Uri "https://syslog.example.com:9543/api/v2/events/text/CONTAINS%20$($newGuid)?limit=1000&timeout=30000&view=SIMPLE&order-by-direction=DESC" -Headers @{Authorization="Bearer $loginToken"}
$queryHt = $myEvents.results | Select-Object hostname, timestampString | Group-Object -Property hostname -AsHashTable

$finalResults = @()
foreach ($check in $sendResults) {
  $finalResults += $check | Select-Object *, @{N='FoundInLogs';E={ $queryHt[$_.Name].timestampString }}
}
$finalResults

If all goes as expected, we should see text in every column for each of our test hosts, with the 'FoundInLogs' column showing a fairly current timestamp. Instead, we found this in our lab:

Name                        SyslogServer                 SyslogMarkSent FoundInLogs
----                        ------------                 -------------- -----------
h259-vesx-43.example.com    udp://192.168.45.73:514      true           2024-11-17 20
h259-vesx-44.example.com    udp://192.168.45.73:514      true
h259-vsanwit-01.example.com                              true
test-vesx-71.example.com    udp://syslog.example.com:514 true           2024-11-17 20

Above we observe two hosts without a value in ‘FoundInLogs’ and one that doesn’t even have a syslog destination configured. The first host does have syslog configured, but our test message was not received. Investigating this host specifically, we find that the host firewall rule allowing outbound syslog was not enabled, as seen in the screenshot below (where we’d expect the check box to be selected):

This was deliberate: I had unchecked that box so the test would fail, just to verify the script logic. The other host (a vSAN witness host) does not have a syslog destination defined at all. This happened to be a gap in how configurations were applied in this environment: the host exists outside of a cluster, and we manage this setting at the cluster level. It's an oversight that is easily corrected. However, without testing we may not have uncovered these issues.

Conclusion

Automation can help ensure not only that settings are consistently configured across an environment, but also that the end-to-end flow is actually working. Hopefully this can help identify logging problems before those logs are needed.

Posted in Lab Infrastructure, Scripting

Leveraging VMware Aria Operations for Power Consumption Tracking

I’ve been looking for a good reason to try out the Aria Operations Management Pack Builder, ever since a peer of mine built one for Pi-Hole back in early 2023. One thing that I thought would be of particular interest was tracking the power consumption of my lab. This article will outline how I achieved this goal.

Getting the data

The first task for this project was finding a way to get the data on my power consumption. The majority of my lab gear plugs into a small APC BackUPS UPS, which provides enough power to handle the occasional blip, but not to run the lab for any meaningful amount of time. This UPS came with a serial/USB cable that could be managed with some software available for Windows. I don’t have a physical Windows system running 24×7 in my lab, but I do have a domain controller virtual machine that resides on local disk of a single system. Since the VM isn’t moving around with DRS, I was able to pass a USB device through to the VM. I added a new USB controller, then added a new USB device, and then my VM settings had the following entries:

With this USB device passed through to the VM, I was able to install the PowerChute Serial Shutdown application. In this application, under Logging > Data Log configuration, you can set how frequently the service should record data; I've set this to 5 minutes. With this configuration enabled, a text file in C:\Program Files\APC\PowerChute Serial Shutdown\agent\energylog is updated with details on the relative load percentage of the UPS as well as the calculated load in watts. This is a great start: we now have a source of data on our local filesystem.

Formatting the data as JSON

The Aria Operations Management Pack Builder can point at an online JSON or XML API and return fields for use in our custom management pack. I had considered enabling a web server (like IIS) to serve up a dynamic page that reads the energylog and re-formats the file as JSON. That seemed like overkill for a single page. I then remembered seeing some code to have PowerShell listen for HTTP requests. After a bit of searching, I put the code sample below together. It listens for HTTP requests; when one is received, PowerShell looks for the latest energylog file, finds the last row with the newest data, and returns that data as a JSON object.

$httpListener = New-Object System.Net.HttpListener
$httpListener.Prefixes.Add('http://*:6545/')
$httpListener.Start()

while($true) {
    $context = $httpListener.GetContext()
    $context.Request.HttpMethod
    $context.Request.Url
    $context.Request.Headers.ToString() # pretty printing with .ToString()

    # use a StreamReader to read the HTTP body as a string
    $requestBodyReader = New-Object System.IO.StreamReader $context.Request.InputStream
    $requestBodyReader.ReadToEnd()

    Get-ChildItem 'C:\Program Files\APC\PowerChute Serial Shutdown\agent\energylog\*.log' | ?{$_.LastWriteTime -gt (Get-Date).AddMinutes(-10)} | Sort-Object LastWriteTime -Descending | Select-Object -First 1 | Foreach-Object {

        # once we know the latest file, lets read the last line and split on the delimiter so we can assign the various parts to descriptive variables
        $contentParts = (Get-Content $_.FullName -Tail 1).Split(';')
        $responseJson = [pscustomobject][ordered]@{
            'HostName'               = $env:computername
            'ModelName'              = (Get-Content $_.FullName -TotalCount 10 |?{$_ -match 'modelname'}).Split('=')[1]
            'FormattedDate'          = (Get-Date).ToString('yyyy-MM-dd HH:mm')
            '2010Date'               = [int]$contentParts[0]
            'relativeLoadPercentage' = [int]$contentParts[2]
            'calculatedLoadWatts'    = [float]$contentParts[3]
        } | ConvertTo-Json
    } # end file loop

    $context.Response.StatusCode = 200
    $context.Response.ContentType = 'application/json'

    $responseBytes = [System.Text.Encoding]::UTF8.GetBytes($responseJson)
    $context.Response.OutputStream.Write($responseBytes, 0, $responseBytes.Length)

    $context.Response.Close() # end the response
} # end while loop

There is likely some room for improvement in that code, but as a proof of concept it gets the job done. I needed to open a Windows firewall port to allow incoming TCP 6545 requests, and then we can test our 'API' from a remote machine easily enough, again using PowerShell:

Invoke-RestMethod -Uri 'http://servername.example.com:6545'

This should return the JSON object we created above. I created a scheduled task to start the above script when the system starts so that it is always running.
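Both of those setup steps can be scripted as well. A minimal sketch, assuming the listener script was saved as C:\Scripts\ups-listener.ps1 (the path, rule name, and task name are all illustrative):

# Allow inbound TCP 6545 so remote clients can reach the listener
New-NetFirewallRule -DisplayName 'UPS JSON Listener' -Direction Inbound -Protocol TCP -LocalPort 6545 -Action Allow

# Register the listener script to run at system startup
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument '-NoProfile -ExecutionPolicy Bypass -File "C:\Scripts\ups-listener.ps1"'
$trigger = New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -TaskName 'UPS JSON Listener' -Action $action -Trigger $trigger -User 'SYSTEM' -RunLevel Highest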

Management Pack Builder

For this part of the process, we’ll download the latest version of the Management Pack Builder appliance from https://marketplace.cloud.vmware.com/services/details/draft-vmware-aria-operation-management-pack-builder-1-1?slug=true and deploy it into our environment.

Our 'API' is super simple: any request to the IP:port will result in our JSON being returned. When building the management pack, we don't need much in the way of authentication, special headers, request query strings, or object relationships. In fact, there are only three tabs where we need to fill in details in Management Pack Builder, listed below.

Source

  • Hostname: the name of our test system with the USB connected UPS & script running.
  • Port: 6545
  • SSL Configuration: No SSL
  • Authentication: Custom, no details required since our ‘service’ doesn’t have authentication.
  • Global Request Settings (optional): no configuration required.
  • Test Connection Request: default GET, no configuration required.
  • Test Connection Request Advanced: no configuration required.
  • Test: Submit ‘Request’ button — should return the JSON object from script

Requests

I set the path to getStats just to have something listed. Since our API is very simple we don’t require specific paths or headers. By default the request name will have the same value. Using the test ‘Request’ button should again return our expected JSON payload. We can then save our requests.

Objects

Next we’ll create an object using the ‘Add New Object’ button.

  • Object Type: APC BackUPS
  • Change Object Icon: Pick an icon; there is one that sort of looks like a battery, so I picked that one.
  • Attributes from the API Request > expand ‘getStats’ > select all the attributes returned (except 2010date, I didn’t need that one)
  • I left the hostname, model name, formatted date as string properties and the Relative Load Percentage and Calculated Load Watts as decimal metrics.
  • Select object instance name: ‘Model Name’
  • Select object identifiers: ‘Model Name’ + ‘Host Name’

That's it! We now have enough of the Management Pack Builder fields populated to build our PAK file. From the Build tab we select 'Perform Collection', then 'Build', and use the Pack File download link to get our file, which should be about 20MB.

Installing and Configuring our Management Pack in Aria Operations

From there we can install our PAK file in Aria Operations. Instead of setting up the 'VMware Aria Operations Connections' feature inside of the Management Pack Builder, I just switched over to Operations, selected Administration > Integrations > Repository > Add, and browsed to my recently downloaded PAK file.

After our integration is installed, we should see an ‘Add Account’ button. Selecting the link will take us to the ‘Add Cloud Account’ page where we can enter the name & hostname of our connection. Here I’ve entered “Server Room UPS” for the name and “dr-control-21.lab.enterpriseadmins.org” as the hostname. Since no username/password are required for our ‘API’, these will be the only required fields.

After a few minutes, we should start seeing data flow into the metrics of our object. I took this screenshot after a few weeks of collection. We can check this out on the Metrics tab of the object and watch our calculated load over time as shown below:

Conclusion

Just because a device doesn’t provide an API doesn’t mean we can’t make our own. Using a bit of custom code + Management Pack Builder allows us to report on almost anything.

Posted in Lab Infrastructure, Scripting

Exploring VM Security: How to Identify Encrypted Virtual Disks in vSphere

I was recently looking at some virtual machines in a lab and trying to determine which had encrypted virtual disks vs. encrypted configuration folders only. This data is visible in the vSphere UI. From the VM list view we can select the ‘pick columns’ icon in the lower left near the export button (in vCenter Server 8 this is called Manage Columns) and select the checkbox for Encryption.

With this selected, we can see that 4 VMs show as encrypted.

However, if we dig a little deeper, we can see that one VM has the configuration files and the only hard disk encrypted, as shown below:

Another VM only has the first hard disk encrypted (note that Hard disk 2 does not show the word ‘Encrypted’ below the disk size).

And yet another VM only has encrypted configuration files and the hard disk is not encrypted at all.

This makes sense, as the virtual machine list view does not show each virtual disk, only the VM configuration. We can encrypt only the configuration, but we can't encrypt a hard disk without also encrypting the configuration. This view shows that there is something going on with encryption, but for what I was looking for we'll need to dig a bit deeper.

Since I wanted to check each VMDK of each VM, and that's not something easily viewable in the UI without lots of clicking, I switched over to PowerCLI. I found an older blog post (https://blogs.vmware.com/vsphere/2016/12/powercli-for-vm-encryption.html) which mentioned a community PowerShell module (https://github.com/vmware/PowerCLI-Example-Scripts/tree/master/Modules/VMware.VMEncryption) to report on encryption. Browsing through the code, I saw a 'KeyID' property that is present on VMs and hard disks where encryption is in use. I created a quick script to loop through all the VMs looking for either of these properties. I could have used the published module, but for this simple exercise it was easy enough to pick and choose the fields I needed.

$myResults = @()
foreach ($thisVM in Get-VM) {
  foreach ($thisVMDK in ($thisVM | Get-HardDisk) ) {
    $myResults += $thisVMDK | Select-Object @{N='VM';E={$thisVM.Name}}, @{N='ConfigEncrypted';E={ if($thisVM.extensionData.config.keyId.KeyId){'True'} }}, 
                @{N='VMDK Encrypted';E={if($_.extensionData.Backing.KeyId.KeyID){'True'} }}, @{N='Hard Disk';E={$_.Name}},
                @{N='vTPM';E={if($thisVM.ExtensionData.config.Hardware.device | ?{$_.key -eq 11000}){'True'} }}
  } # end foreach VMDK
} # end foreach VM

$myResults | Sort-Object VM | Format-Table -AutoSize

Our $myResults variable now contains a row for each virtual hard disk, showing the VM Name, whether or not the ‘Home’ configuration is encrypted, if the VMDK is encrypted, the Hard Disk Name, and if the system has a vTPM or not. By default, the output will sort all the VMs by name, and list all of the properties. However, if I needed a list of all the VMs that might have one or more encrypted VMDKs, I could use the following Where-Object filter.

$myResults | Where-Object {$_.'VMDK Encrypted' -eq 'True'} | Select-Object VM -Unique

This will result in a list of VM names, showing only two interesting VMs. The above screenshot from the UI showed four VMs with encrypted configs.

Hopefully this will be helpful if you are looking for encrypted VMs in an environment.

Posted in Scripting, Virtualization