Extending Aria Automation with Custom Resources and Actions for IP Address Management

In my lab, I leverage Aria Automation to deploy Linux, Windows, and nested ESXi VMs. This is my primary interface for requesting new systems and covers most of the common resources I need for testing. However, I sometimes deploy one-off appliances at a scale where automation hasn't been built. These appliances typically require an IP address and a DNS record. I had previously created a Jenkins job that accepted parameters, making these records easy enough to create, but cleanup is where I would fall down. I also wasn't a fan of switching between the Aria Automation and Jenkins consoles to submit these requests.

My ideal solution to both of these problems was an Aria Automation request form that would create a deployment tracking these one-off IP requests. To avoid reinventing the wheel, the Aria Automation request could simply call Jenkins. When testing is complete, I'd have a deployment remaining in Aria Automation to serve as a reminder to properly clean up IPAM and DNS. This article covers the process of creating the action, resource, and template to front-end the Jenkins request with Aria Automation.

Custom Action – Create

In Aria Automation Assembler > Extensibility > Actions, we can create a new action. I named mine IPAM Next Address Create and selected only the project where my test deployments live.

For the action, I'm writing everything in PowerShell, since I already know that language and Aria Automation supports it. This code sample lacks robust error handling and could probably be cleaned up a fair amount, but it got the job done for my purposes. In a production environment, adding logic after each step to confirm the task completed would be prudent: if the IPAM service is down or Jenkins isn't responding, we'd want the request to fail in a predictable way, along the lines of the sketch below.
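
As a minimal sketch of what that guard might look like (hypothetical, not part of the action code that follows; the variables mirror those used in the create action below):

# Minimal sketch only: wrap a call so a down service fails the request predictably.
try {
    $subnetData = Invoke-RestMethod -Uri "$($ipamBaseURL)subnets/cidr/$subnet" -SkipCertificateCheck -Headers $nextHeader
    if (-not $subnetData.data.id) { throw "Subnet $subnet not found in IPAM" }
} catch {
    # Re-throwing fails the ABX action, so the deployment shows a clear error
    throw "IPAM lookup failed: $($_.Exception.Message)"
}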

The create action has the most code: it connects to phpIPAM to get the next address, then asks Jenkins to create a DNS record. I obtain the IP address directly so that I can return it as part of the deployment, making the reserved IP clearly visible there.

function handler($context, $inputs) {
    $subnet = $inputs.subnet
    $hostname = $inputs.name
    
    write-host "We've received a $($inputs.'__metadata'.operation) request for subnet $subnet"
 
    $ipamServer = 'ipam.apps.example.com'
    $ipamUser   = 'svc-vra'
    $ipamPass   = 'VMware1!'
    $ipamBaseURL = 'https://'+$ipamServer+'/api/'+$ipamUser+'/'

    # Login to the API with username/password provided.  Create header to be used in next requests.
    write-host "IPAM Login"
    $ipamLogin = (Invoke-RestMethod -Uri "$($ipamBaseURL)user" -Method Post -SkipCertificateCheck -Headers @{'Authorization'='Basic '+[Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($ipamUser+':'+$ipamPass))}).data.token
    $nextHeader = @{'phpipam-token'=$ipamLogin}

    # Get the subnet ID of the specified CIDR
    write-host "IPAM Get Subnet ID"
    $subnetID = (Invoke-RestMethod -URI "$($ipamBaseURL)subnets/cidr/$subnet" -SkipCertificateCheck -Headers $nextHeader).data.id

    # Make a reservation and provide name/description
    write-host "IPAM Reserve Next"
    $postBody = @{hostname="$($hostname).lab.enterpriseadmins.org"; description='Requested via Automation Extensibility'}
    $myIPrequest = (Invoke-RestMethod -URI "$($ipamBaseURL)addresses/first_free/$subnetID" -SkipCertificateCheck -Method Post -Headers $nextHeader -Body $postBody).data
    
    # Send a DNS Request to Jenkins
    write-host "Jenkins DNS Request"
    $dnsBody = @{reqtype='add'; reqhostname=$hostname; reqipaddress = $myIPrequest; reqzonename='lab.enterpriseadmins.org'} | ConvertTo-Json
    Invoke-RestMethod -URI 'http://jenkins.example.com:8080/generic-webhook-trigger/invoke?token=VRA-dnsRecord' -Method Post -Body $dnsBody -ContentType 'application/json'

    # Return detail to vRA
    $outputs = @{
        address = $myIPrequest
        resourceName = $hostname
    }
    return $outputs
}

The IP address obtained from IPAM as well as the hostname are returned when this task completes.

Custom Action – Read

For our custom resource, we also need to specify an action to read/check the status of the resource. For my purposes, nothing specific needs to be checked, so I simply return all the input parameters. This is the default function/template loaded when creating an action.

function handler($context, $inputs) {
    return $inputs
}

Custom Action – Delete

When we are finished with our deployment and ready to delete it, the custom resource needs a 'delete' action to call. Again, this is written in PowerShell and calls Jenkins to request the actual deletion. Jenkins then connects to DNS and IPAM to process the cleanup.

function handler($context, $inputs) {
    $ipAddress = $inputs.address
    $hostname = $inputs.name
    
    write-host "We've received a $($inputs.'__metadata'.operation) request for IP address $ipAddress and hostname $hostname"
     
    $removeBody = @{reqzonename='lab.enterpriseadmins.org'; operationType='remove'; reqhostname=$hostname; subnetOrIp = $ipAddress} | ConvertTo-Json
    Invoke-RestMethod -URI 'http://jenkins.example.com:8080/generic-webhook-trigger/invoke?token=RequestIpAndDnsRecord' -Method Post -Body $removeBody -ContentType 'application/json'
}

This code could easily have contacted IPAM and DNS as separate requests, but since the Jenkins job already existed with webhook support, I chose to follow that path for simplicity.

Create Custom Resource

In Aria Automation Assembler > Design > Custom Resources we can create a new resource that will run the above actions. I named my resource IPAM Next Address, set the resource type to Custom.IPAM.Request, and based the resource on an ABX user-defined schema. For lifecycle actions, I selected the IPAM Next Address actions described above for the three required types: create, read, and destroy. For starters I set the scope to only my test project, and finally toggled the 'activate' switch to make the resource available in blueprints.

Create Template

In Aria Automation Assembler > Design > Custom Template, the design for this request is super simple. There are three inputs: issueNumber, Name, and Subnet. The issue number is used for tracking and becomes part of the hostname. The name is the unique part of the hostname, and the subnet selects which network to use when finding the next address. The hostname ends up being h<issue-number-padded-3-digits>-<name-entered>, so issue 7 with the name ip-01 would become h007-ip-01 (h is a prefix I use for test systems in my homelab). The subnet is a drop-down list of the networks I typically use for testing, defaulting to the selection I use most often.

formatVersion: 1
inputs:
  issueNumber:
    type: integer
    title: Issue Number
  Name:
    type: string
    minLength: 1
    maxLength: 25
    default: ip-01
  Subnet:
    type: string
    title: Subnet
    default: 192.168.10.0/24
    enum:
      - 192.168.10.0/24
      - 192.168.40.0/24
resources:
  IPAddress:
    type: Custom.IPAM.Request
    properties:
      name: h${format("%03d",input.issueNumber)}-${input.Name}
      subnet: ${input.Subnet}
      address: ''
      git-issue-number: ${input.issueNumber}

Deploy

Once I published a version of this design, I could make a request from the Service Broker catalog. My request form has only a few required fields.

I added some functionality to the 'create' action to post a comment to my issue tracker letting me know that a new resource has been created. The comment includes a task-list checkbox, so I can see there is an open item to review for the issue, as well as a link to the deployment. A rough sketch of that call is below.
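
As a sketch only, assuming a Gitea-style issue comments API (the server name, repository path, token, and deployment URL here are hypothetical placeholders):

# Hypothetical sketch: post a task-list comment to a Gitea-style issue tracker.
# The server, repo path, token, and $deploymentUrl are placeholders for illustration.
$issueComment = @{ body = "- [ ] Review one-off IP request for $hostname`n`nDeployment: $deploymentUrl" } | ConvertTo-Json
Invoke-RestMethod -Uri "https://git.example.com/api/v1/repos/lab/tracking/issues/$($inputs.'git-issue-number')/comments" `
  -Method Post -Body $issueComment -ContentType 'application/json' `
  -Headers @{ Authorization = 'token <api-token>' }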

When I look at the deployment, I can see when it was created, if it expires, and can use the actions drop down to delete the deployment. This delete action calls the Jenkins job mentioned above to remove the DNS record and release the IP address from IPAM.

Conclusion

Aria Automation can provide an interface to leverage existing workflows. This example shows how to create a deployment to track the lifecycle of a created resource, while leveraging an existing system to handle the actual task. This solves my cleanup / tracking issue for one off IP requests as well as getting all the requests submitted from a single console. Hopefully you can use pieces of this workflow in your own environment.


Automating SSL Certificate Replacement with the Aria Suite Lifecycle API

Someone recently asked me if there was an API to replace the Aria Operations for Logs SSL certificate programmatically. In this case, Aria Suite Lifecycle was already deployed and managing multiple Aria Operations for Logs clusters, primarily running in regional data centers to forward events to a centralized instance. This meant the ideal solution would leverage Aria Suite Lifecycle as well, adding the certificate to the locker prior to replacing it in the product. A colleague of mine recently published a blog post showing how to rotate Aria Suite Local Account Passwords using APIs and PowerShell (https://stephanmctighe.com/2024/12/20/rotating-aria-suite-local-account-passwords-using-apis-powershell/), so for consistency I used the same splatting style in this post.

Due to the varied nature of requesting/approving certificates, I did not cover creating a certificate signing request via the API in this example. However, it is possible to do this via API as well. The 'Create CSR and Key Using POST' operation can be called with a POST to /lcm/locker/api/v2/certificates/csr, as described at https://developer.broadcom.com/xapis/vmware-aria-suite-lifecycle-rest-api/8.14//lcm-15-186.eng.vmware.com/lcm/locker/api/v2/certificates/csr/post/; a rough sketch of that call is shown below.
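
As a sketch only, reusing the splatting pattern and the connection variables defined later in this post ($csrBody is a placeholder for the JSON body fields documented at the link above):

# Sketch only: create a CSR and key in the locker. $csrBody is a placeholder for the
# JSON body (CN, org, SANs, key size, etc.) described in the linked API documentation.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/locker/api/v2/certificates/csr"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Body"    = $csrBody
    "Method"  = "POST"
}
Invoke-RestMethod @Splat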

Workflow

I first worked through each of these steps by creating a new collection in Bruno and stepping through each API to understand the inputs/outputs and how everything worked together. Once complete, I converted each of the requests from Bruno into a single PowerShell script, so the end-to-end workflow lives in a single document for reference. In the sections below, I'll step through each chunk of the script and add some context on why each section exists and what it does.

Setting up the script

For readability and usability, I decided to have a block of variables and paths at the very start of the script. In this section, the Aria Suite Lifecycle hostname, credentials, and basic auth string are defined. There are then a handful of filenames/paths for the certificate, root certificate, and key I created from a Windows Certificate Services deployment. Finally, we name the Aria Suite Lifecycle environment containing the product we need to update. For demonstration purposes, I created an environment named h308-logs, which contains only a single product (Aria Operations for Logs).

# LCM connection detail
$lcmHost = 'cm-lifecycle-02.lab.enterpriseadmins.org'
$username = 'admin@local'
$password = 'VMware1!'
$authorization = "Basic $([System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes("$($username):$($password)")))"

# Certificate/environment detail
$newCertificateAlias = 'h308-logs-01.lab.enterpriseadmins.org_2025-01-14'
$newCertificateFolder = 'C:\Users\bwuchner\Downloads'
$newCertificateCSR  = 'CSR_h308-logs-01.lab.enterpriseadmins.org_Test.pem'
$newCertificateFile = 'CERT_h308-logs-01.cer'
$newCertificateRoot = 'CERT_rootca.cer'
$environmentName = 'h308-logs'

Reading the certificate files

Our certificate consists of multiple files:

  1. The private key, which is at the end of the certificate signing request (CSR) file that was generated by the Aria Suite Lifecycle GUI.
  2. The certificate file, which was obtained from our certificate authority and contains subject alternative names for our Aria Operations for Logs hostname and IP address.
  3. The root certificate from our certificate authority. In this lab, no intermediate certificates are required; if there were, they could be added to the $cert variable below.

When using Get-Content, PowerShell reads the file one line at a time by default. In the examples below, we join the lines with a newline character (`n) so that the API will understand our request. Failing to do so might result in errors like 'parsing issue: malformed PEM data encountered', 'LCM_CERTIFICATE_API_ERROR0000', or 'Unknown Certificate error'.

# When we generated a CSR in the UI, before sending it to our CA, the private key is at the end of the
# CSR file.  We'll read that file, loop through and find the start/end of the private key, then format
# it to send in our JSON body
$key = Get-Content "$newCertificateFolder\$newCertificateCSR"
$keyCounter = 0
$key | %{if($_ -eq '-----BEGIN PRIVATE KEY-----'){$keyStartLine=$keyCounter};  if($_ -eq '-----END PRIVATE KEY-----'){$keyEndLine=$keyCounter}; $keyCounter++ }
$key = ($key[$keyStartLine..$keyEndLine] -join "`n") 

# We'll also read in our cert and concatenate each line with a new line character.
# If we have intermediate certs they can be joined in a similar way
$cert = ((Get-Content "$newCertificateFolder\$newCertificateFile") -join "`n") + "`n"
$cert += ((Get-Content "$newCertificateFolder\$newCertificateRoot") -join "`n") + "`n"

Adding the certificate to the locker

We can POST our new certificate/key combo to the /lcm/locker/api/v2/certificates/import API. It returns details about the certificate, such as the alias provided, the validity, and SHA256/SHA1 hashes. It does not return the ID of the certificate in the locker, which we'll need in a later step, so we follow up with a GET and filter for the alias we used in the import request.

$Splat = @{
    "URI"     = "https://$lcmHost/lcm/locker/api/v2/certificates/import"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Body"    = @{
        'alias'         = $newCertificateAlias
        'certificateChain' = $cert
        'privateKey'    = $key
    } | ConvertTo-JSON
    "Method"  = "POST"
}
$NewCertPost = Invoke-RestMethod @Splat
# the newcertpost variable will have detail on our certificate, its validity, and san fields.
# we will need cert ID, so we'll make a query for it.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/locker/api/v2/certificates"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
$lockerCertId = ((Invoke-RestMethod @Splat).Certificates | ?{$_.alias -eq $newCertificateAlias}).vmid

Depending on which parts of this process we want to automate, it would also be possible to grab the certificate ID from the locker in the GUI. When viewing a specific certificate, the ID is the GUID in the address bar, right after /lcm/locker/certificate.

Finding the environment ID

To replace the product certificate, we’ll need to know which environment ID needs to be updated. We can find this information from the API or the GUI. We’ll start by doing a GET operation for all environments, then filtering by the environment name variable declared at the beginning of the script.

# now that we have our new cert in the locker, we can apply it to the product
# Get Environment ID
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments?status=COMPLETED"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
$Environments = Invoke-RestMethod @Splat

# find our specific environment ID
$environmentId = ($Environments |?{$_.environmentName -eq $environmentName}).environmentId

When we are looking at our specific environment in the GUI, the ID can be found in the address bar right after /lcm/lcops/environments.

Finding the product ID

The product ID is also needed for the certificate replacement request. After running the code block above that creates the $Environments variable, we can list the product IDs using the code below. It again filters the list and selects all applicable products in our specific environment:

# we also need to know the product ID.  We can get a list of product IDs for the above environment using
# the example below.  In this case we only have Ops for Logs, aka vrli
# ($Environments |?{$_.environmentName -eq $environmentName}).products.id
# vrli

I didn't find an easy way to see this product ID in the GUI. However, if you are looking at a specific product and select … > Export Configuration > Simple, the resulting file name should contain the product ID (example: h308-logs-vrli.json).

To make this more like a multiple-choice question, the values that I currently have across all products in my lab are listed below:

  • vidm
  • vra
  • vrli
  • vrni
  • vrops
  • vssc

Validating the certificate

In the section below, we POST to the pre-validate API to make sure our certificate will work. This API only returns the request ID of the task that is created. We can view progress of the request in the GUI, using a URL like https://cm-lifecycle-02.lab.enterpriseadmins.org/lcm/lcops/requests/acd529f9-e8af-4c61-9d6d-14ee15730c9d, where the value of $prevalidateRequest is the GUID at the end of the URL. In our code block we also wait 30 seconds, then GET the status of our request from the API. We need this to return COMPLETED prior to moving on to the next step. This sample does not include error checking/handling, as it is primarily an example of calling the APIs; a simple polling loop is sketched after the code.

# Now that we know all the relevant IDs, we can verify our new cert will work.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments/$environmentId/products/vrli/certificates/$lockerCertId/pre-validate"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "POST"
}
$prevalidateRequest = (Invoke-RestMethod @Splat).requestId

# lets confirm that our validation completed.
# we may need to wait/recheck here
Start-Sleep -Seconds 30

# Lets ask the requests API if our task is complete.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/request/api/v2/requests/$prevalidateRequest"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
(Invoke-RestMethod @Splat).state  # we want this to return 'COMPLETED'.  If it didn't we should recheck/fail/not continue.
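
Rather than a single fixed sleep, a small polling loop could recheck until the request finishes. This is a sketch reusing the $Splat from above; the timeout value is an arbitrary choice:

# Sketch: poll the request until it completes, with a simple timeout.
# Only COMPLETED/INPROGRESS states are referenced in this post; handle others as needed.
$deadline = (Get-Date).AddMinutes(15)
do {
    Start-Sleep -Seconds 30
    $state = (Invoke-RestMethod @Splat).state
    Write-Host "Request state: $state"
} while ($state -ne 'COMPLETED' -and (Get-Date) -lt $deadline)
if ($state -ne 'COMPLETED') { throw "Request did not complete in time (last state: $state)" }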

Replacing the certificate

Assuming the pre-validate request above completed, we can move on to the certificate replacement. We do that with a PUT to the same certificates endpoint, providing the ID of the new certificate in the locker. The PUT only returns the request ID of our task.

# Assuming the above completed, lets keep moving and actually replace the cert.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments/$environmentId/products/vrli/certificates/$lockerCertId"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "PUT"
}
$replacementRequest = (Invoke-RestMethod @Splat).requestId

Checking request status

As mentioned above in the 'validating the certificate' section, we can query the request status from the API as well. This is the same code block used earlier, only changing the value of the variable at the end of the request URI.

# Once we start the replacement we should wait a bit of time and then see if it is complete
Start-Sleep -Seconds 30
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/request/api/v2/requests/$replacementRequest"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
(Invoke-RestMethod @Splat).state  # we want this to return 'COMPLETED'.  If it returns 'INPROGRESS' we may want to wait/recheck until 'COMPLETED'.

As mentioned before, we can view the status of our request in the GUI as well. The URL would be https://cm-lifecycle-02.lab.enterpriseadmins.org/lcm/lcops/requests/acd529f9-e8af-4c61-9d6d-14ee15730c9d, where the value of $replacementRequest is the GUID at the end of our URL. Alternatively, we could look in the requests tab for the request name of VRLI in Environment h308-logs - Replace Certificate.

Follow up tasks

After replacing a certificate, it is always a good idea to verify that the new certificate is trusted by other products. For example, if you are using CFAPI to forward logs to this Aria Operations for Logs instance, check the source systems to make sure they trust the new certificate. In addition, Aria Operations and Aria Operations for Logs can be integrated; from the Aria Operations integration, confirm that Aria Operations for Logs is still trusted after completing this change. This is not specific to the API, just a reminder to ensure new certificates are trusted, whether the replacement happened in the GUI or via the API. One quick way to see what a server is now presenting is sketched below.
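
For a quick spot check from PowerShell, something like the following fetches the certificate a server presents and shows its subject and expiry. This is a generic .NET approach, not an Aria Suite Lifecycle API call; the hostname is the lab example used earlier:

# Sketch: inspect the certificate presented on port 443 (skips trust validation on purpose).
$name = 'h308-logs-01.lab.enterpriseadmins.org'
$tcp  = [System.Net.Sockets.TcpClient]::new($name, 443)
$ssl  = [System.Net.Security.SslStream]::new($tcp.GetStream(), $false, { $true })
$ssl.AuthenticateAsClient($name)
[System.Security.Cryptography.X509Certificates.X509Certificate2]::new($ssl.RemoteCertificate) |
    Select-Object Subject, Thumbprint, NotAfter
$ssl.Dispose(); $tcp.Dispose()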

Conclusion

In this post, we’ve explored how to automate the replacement of an SSL certificate in Aria Operations for Logs using the Aria Suite Lifecycle API. By leveraging PowerShell and the API’s various endpoints, we can streamline the process of managing certificates across Aria Suite environments, ensuring better security and consistency.

Remember, while the steps outlined here focus on certificate replacement, this workflow can also be adapted for other automation tasks within Aria Suite Lifecycle. As with any automation effort, it’s important to test thoroughly in a controlled environment and validate that all systems are properly configured and trust the updated certificates.

Whether you’re managing a single Aria Operations for Logs instance or multiple clusters, automating tasks like certificate replacement can significantly reduce manual effort and minimize downtime. Please continue to explore further API capabilities to enhance your operational efficiency and security posture!


Unlocking the Power of Metric-Based Search in Aria Operations

When managing a large, virtualized environment, finding objects in Aria Operations can be challenging, especially when you don't know the object name. Metric-based search, a feature introduced in Aria Operations 8.12, allows you to search for objects based on their metrics or properties, empowering you to quickly identify issues even without knowing specific names.

I recently posted about replacing some CPUs in my primary homelab system (https://enterpriseadmins.org/blog/virtualization/how-i-doubled-my-homelab-cpu-capacity-for-200-xeon-gold-6230-upgrade/). Prior to making that change, I knew I had a couple of VMs with rather high CPU Ready values, and I suspected CPU Ready would have decreased given the additional cores. I had an idea of which VMs were likely affected, but wanted to leverage metric-based search to make sure I wasn't missing any.

What Is Metric-Based Search?

Metric-based search was introduced in Aria Operations 8.12, almost two years ago (https://blogs.vmware.com/management/2023/04/metric-based-search.html). It allows us to use metrics and properties in our search queries. Instead of typing a VM name, we can type a query for all VMs with high CPU Ready or Usage, like this:

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20

We start out by typing ‘Metric’, telling the search box we want to search using a metric, we then specify the object type of virtual machine, and finally use a where clause to provide additional metrics we wish to look at. The search bar helps auto-complete the entries and will have a green check once we have the syntax correct.

In this case the query only returns one VM: my Aria Automation VM, which currently has >20% CPU usage. I'm not able to use the 'transformation' selection because the environment has 225 VMs, which is larger than the maximum scope of 200 called out in the tooltip.

Using the ‘ChildOf’ Clause to Narrow Down Results

To refine my search results, I use the ‘childOf’ clause, which allows me to narrow down the query to a specific ESXi host. This is especially useful when I know the VMs I’m looking for are on the same host but don’t know their names.

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20 childOf core-esxi-34.example.com 

This unlocked the filter 'transformation' drop-down list, and I can now look at maximum values instead of current values. I could have used a different object in my childOf query, like a vSphere folder, distributed port group, datacenter, or custom datacenter: really any object that is a parent of a virtual machine in the inventory hierarchy. We can see that more VMs now match the criteria. Each of these VMs had CPU Ready above 2% prior to installing the new CPUs; after the upgrade, the values are much lower.

Understanding the Impact of CPU Speed on Performance Metrics

Interestingly, in the above images we can see that while CPU Ready decreased substantially, CPU Usage actually increased. I believe this is due to the clock speed of the CPU cores: previously the cores ran at 3.8 GHz, but they now run at 2.1 GHz. To do the same amount of work, the slower cores must run at a higher percentage; for example, a workload consuming 1.9 GHz is 50% of a 3.8 GHz core but roughly 90% of a 2.1 GHz core.

Other Use Cases for Metric-Based Search

The side-by-side comparison of metrics in metric-based search is really helpful. It included the CPU Ready and CPU Usage values because those were the first two metrics in my query. If I adjust my query to include three metrics, such as:

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20 or Memory|Usage % > 5 childOf core-esxi-34.example.com 

I can select which metric is displayed in the left or right column using the column selector in the bottom left of the screen.

In the above examples, we are looking specifically at metrics of VMs. However, we can query properties the same way as well, and also query for different object types. Here are a few examples:

VMs that have more than 5 VMDKs (property): metric: Virtual Machine where Configuration|Number of VMDKs > 5

ESXi hosts that have less than 16 CPU cores (metric): Metric: Host System where Hardware|CPU Information|Number of CPU Cores < 16

Datastores with reclaimable orphaned disks (metric) and type (property): Metric: Datastore where Reclaimable|Orphaned Disks|Disk Space GB > 1 and Summary|Type equals 'NFS'

Conclusion: The Power of Metric-Based Search in Aria Operations

Metric-based search in Aria Operations is a powerful tool that helps you find the right objects even when you don’t know their names. By leveraging metrics like CPU usage or memory usage, you can quickly identify performance bottlenecks and optimize your virtualized infrastructure.

Posted in Lab Infrastructure, Virtualization | Leave a comment

How I Doubled My Homelab CPU Capacity for $200: Xeon Gold 6230 Upgrade

In this post, I’ll walk you through how I solved a growing CPU bottleneck issue in my homelab by upgrading the CPUs in my Dell Precision 7920. I’ll share the process, challenges, and cost-effective solution that allowed me to double my system’s CPU capacity.

The primary system in my homelab is a Dell Precision 7920 tower. I purchased it on eBay with 2x Xeon Gold 5222 CPUs and 512GB of RAM about two years ago, replacing a pair of older HP DL360 Gen8 rack-mount systems. The older HP systems each had a pair of E5-2450L CPUs, 8 cores each at 1.8 GHz, for a total of 28.8 GHz per system… but those systems were primarily constrained by RAM, not CPU. Based on some rough math, I made the decision to go from a total of 32 cores at 1.8 GHz to just 8 cores at 3.8 GHz.

In the first ~6 months, everything was great: neither CPU nor RAM was a bottleneck, and everything ran well. However, as I added more and more nested environments (including nested VCF), I started running into CPU contention. By early 2024, I knew this cluster's CPU usage was high; I could see from Aria Operations that CPU demand was well above the usable capacity most of the time.

CPU Demand of 30-Greenfield cluster, taken in early 2024

Around that time I looked into replacement CPUs for this system. I attempted to drop in a pair of Xeon Gold 6138 CPUs (1st Gen Scalable), as they were very inexpensive (around $50/pair). Unfortunately, these CPUs were not compatible with the RAM in this system. The memory configuration is 8x 64GB 2933 MHz DIMMs, which limited my CPU choices to those that support 2933 MHz memory (based on the table on page 91 of the owner's manual). 2nd Gen Scalable CPUs were preferred, as they are not expected to be deprecated in the next major vSphere release (per https://knowledge.broadcom.com/external/article/318697/cpu-support-deprecation-and-discontinuat.html). I decided the best two options were the Xeon Gold 6238 (22 cores/socket at 2.1 GHz) or 6230 (20 cores/socket at 2.1 GHz). At the time these CPUs were running about $500/ea (6238) or $350/ea (6230) from various eBay sellers. I decided to hold off on the replacement and instead turn certain environments off/on as needed rather than running them all the time.

A few weeks ago, when running most of my nested environments concurrently again, I was seeing high CPU use. I did a bit more research and confirmed that the 6238 and 6230 CPUs were still solid options for what I needed, but now the prices had fallen to $350/ea (6238) and $95/ea (6230). The 6238 CPUs would provide a total of 92 GHz of capacity, while the 6230s would deliver 84 GHz. Given that the demand for the cluster is only around 45 GHz, the lower-cost 6230s, at about 2x the capacity I needed, looked like a solid option. I decided to pick up a pair of 6230s and get them switched out. In the chart below, you can see that a few days prior to the "Now" line, the usable capacity of this cluster more than doubled. Aria Operations now shows more than one year remaining until CPU capacity runs out.

CPU Demand of 30-Greenfield cluster, taken in Jan 2025 after replacing CPUs

Conclusion

I knew that CPU usage was high and that the most obvious solution was to add capacity. Memory constraints helped narrow the options to just two, and having specific capacity history helped me make the most cost-effective choice. Instead of spending $700 on a pair of 6238 CPUs, I was able to solve the issue with $200 for a pair of 6230s. After making the change, reviewing the same chart confirmed that the issue is in fact solved.


Getting Started with Bruno: A Beginner’s Guide to Simplifying API Interactions

I've recently been working with APIs more than ever, and a colleague introduced me to Bruno, an offline, open-source API client. For the most part, I had been interacting with APIs using in-product Swagger UIs or PowerShell's Invoke-RestMethod. This is sometimes challenging: remembering complex URLs, managing headers, and handling authentication. Bruno provides a standalone GUI to help streamline these tasks. As I was getting up to speed with the interface, I revisited several prior posts to connect to APIs I was familiar with. The following notes are what I learned while getting started with Bruno.

Example 1: APC BackUPS

In a previous article we explored creating our own custom 'API' using a PowerShell HTTP listener. That was a very basic example, as it required no authentication or special headers. Since this API is so simple, it's an easy first example for Bruno.

  • Create new collection
  • Create new request
    • Name the request (getStats)
    • Specify the URL (http://servername.example.com:6545/getStats)
    • Run (the right arrow at the end of the URL).

The response should show the JSON body that we crafted in our script. Now if I need to run this again, I don't have to remember the hostname/port for my service; I can just hit go and get the current response. The equivalent call from PowerShell is shown below for comparison.
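
For comparison, the same request from PowerShell is a single call against the URL used above:

# The same request outside of Bruno; hostname/port are from the getStats request above.
Invoke-RestMethod -Uri 'http://servername.example.com:6545/getStats'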

Example 1b: Using a Bruno ‘environment’

In the top right of a Bruno collection, there is a drop-down list that says 'No Environment'. If we select that drop-down and choose 'Configure', we can create a new environment. An environment is a place to store variables, like server names, ports, and credentials. For my example, I created an environment named 'server room'. In the 'server room' environment, I defined a variable named apcBackupsHost with the value servername.example.com. With this environment variable defined, I can edit my URL to use the variable name enclosed in a pair of curly braces: http://{{apcBackupsHost}}:6545/getStats.

If I had multiple hosts running this API, I could create different environments for each. That way I can toggle between them using the environment drop down list and not need to update any of my API calls. Using this environment functionality can help save time when working with different environments (e.g., production vs. staging) and they can help prevent errors when managing credentials or server names.

Example 2: VI/JSON

The next example comes from a prior post as well: Using VI/JSON with PowerShell. VI/JSON was introduced in vSphere 8.0U1 as a way of accessing the vSphere Web Services SDK via a REST interface. To get started with this in Bruno, we'll make a new collection with a POST request to log in. We'll also make an environment for this collection with four variables:

  • VC = vCenter Server name or IP
  • vcVersion = the version used in our request (8.0.3.0 in my case)
  • username = the username used to connect to vCenter Server
  • password = the password used to connect to vCenter Server

I've named my request 'Login' and set a few properties. First, the URL is https://{{VC}}/sdk/vim25/{{vcVersion}}/SessionManager/SessionManager/Login, which contains two of the variables from my environment. The body of the login contains the other two variables.

In addition to the Body, I've made two other tweaks to this request. You can see where tweaks have been made in the screenshot: any tab with a change has an indicator count. The specific changes are outlined below:

  • Headers:
    • Name = Content-Type
    • Value = application/json
  • Vars > Post Response:
    • Name: vmware-api-session-id
    • Expr: res.headers['vmware-api-session-id']

The post response variable says to take the vmware-api-session-id response header value and save it in a variable for future use, like in our next request.

My second request, named 'Get VM', is a GET of https://{{VC}}/sdk/vim25/{{vcVersion}}/VirtualMachine/vm-31/config, where vm-31 is the managed object reference ID of a specific VM. For this request, I've set two headers: content-type=application/json and vmware-api-session-id={{vmware-api-session-id}}, which uses the variable we captured from the login request.

With these two headers defined, we can send our request, and it will retrieve the configuration details of our specific VM.

If there is another request we need to make in this same collection, we can right click the name of our request (Get VM in this case) and clone the request. This will make a new request with the same customized values already populated. This allows us to simply change the URL and submit a different request. For example, if I want to get details about all my license keys, I can change the URL to https://{{VC}}/sdk/vim25/{{vcVersion}}/LicenseManager/LicenseManager/licenses. The headers are already populated so I can send the request (CTRL + Enter is the default key binding for this task) and we’ll have a JSON body showing all of our license keys.

Example 3: Aria Ops Casa API

Finally, in another previous post we looked at logging into the Aria Operations CASA API using an LDAP account. This is a bit more difficult, as we need to base64-encode a username:password string to pass as a header for authentication. Let's see if we can do the same in Bruno.

  • Create new collection Aria Ops Casa
  • Create new request Casa Login
  • Create new environment lab with three variables: vropsServer, vropsCasaLdapUser, vropsCasaLdapPass and enter appropriate values. For the password I checked the ‘secret’ checkbox.
  • For the request type we’ll select POST and for our URL we will enter https://{{vropsServer}}/casa/authorize
  • On the script tab, we'll build a 'pre request' script to do some of the heavy lifting for authentication. Specifically, we'll use a built-in function to base64-encode our username/password string and then set the request's Authorization header using that string. Sample code below:
const btoa = require("btoa");
var b64login = "vrops-ldap " + btoa(bru.getEnvVar("vropsCasaLdapUser")+":"+bru.getEnvVar("vropsCasaLdapPass"));
req.setHeader("Authorization", b64login );
  • On the Vars tab we’ll update the post response section to create a new variable named accessToken and use the expression res.body.accessToken to get the accessToken property from the body of the response.

Running the above request will take our server name, username, and password variables and use them to connect to the API, then create a new variable holding the token we need for future requests. The same login can be reproduced in PowerShell, as sketched below.
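
For reference, this is roughly the same login from PowerShell, reusing the /casa/authorize endpoint and 'vrops-ldap' header scheme from the Bruno request above (the variable values are placeholders):

# Same CASA login outside of Bruno; endpoint and header scheme match the Bruno request above.
$vropsServer = 'vrops.example.com'   # placeholder values
$user = 'svc-account'; $pass = 'password'
$b64 = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("$($user):$($pass)"))
$token = (Invoke-RestMethod -Uri "https://$vropsServer/casa/authorize" -Method Post `
    -Headers @{ Authorization = "vrops-ldap $b64" } -SkipCertificateCheck).accessToken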

To check the Aria Operations cluster status, we'll create a new API request. This request must run after the one above, which populates the accessToken variable.

We now have this collection saved so we can easily access it in the future. If we have additional Aria Operations instances, we can copy the environment (so all the variable names come over) and update the values accordingly. This gives us a quick drop-down to select which Aria Operations environment to query, so we don't need to re-enter usernames and passwords every time.

Conclusion

Bruno makes quick work of firing off a simple API call. The collections and environments are useful, especially when we have many endpoints we may want to query. I can see including this application as part of my API toolkit and you should consider it too. More information about Bruno can be found in the official docs at https://docs.usebruno.com/introduction/what-is-bruno.
