Aria Operations Self Health dashboard visibility

In Aria Operations there are several dashboards that are enabled by default to help ensure the Aria Operations environment is healthy. These dashboards are available under Visualize > Dashboards in a folder named VMware Aria Operations. The specific dashboards have names like:

  • Self Cluster Health
  • Self Health
  • Self Performance Details
  • Self Services Communications
  • Self Services Summary
  • Self Troubleshooting
  • vCenter Adapter Details

Recently I was working with a colleague who couldn’t see these dashboards. According to the documentation (https://docs.vmware.com/en/VMware-Aria-Operations/8.12/Best-Practices-Operations/GUID-8D7D3B14-6A4D-4895-B583-18753F03E48D.html), these dashboards should be activated by default. This led us to check the permissions on these dashboards, where we found that by default they are not shared. Typically, built-in dashboards are shared with everyone, which can be confirmed by clicking the “Share Dashboard” icon in the top right, selecting the “Groups” tab, and reviewing the “Dashboard shared with” text at the bottom of the popup (shown below):

Example of a built-in dashboard shared with Everyone.

The built-in self health dashboards were not shared with anyone by default; only the built-in admin account has access. This seems reasonable: you likely don’t need (or want) everyone troubleshooting your Aria Operations cluster. However, for those tasked with maintaining Aria Operations deployments, these dashboards are very useful. If we are logged in as the admin user who owns these dashboards, from the share dashboard screen above we can select our custom administrators group and click “Include” to start sharing the relevant dashboards. The next time members of the admins group log in, they should see these self health dashboards in the VMware Aria Operations folder.

Posted in Lab Infrastructure, Virtualization | Leave a comment

MongoDB: Test data for performance monitoring

This post will cover loading some test data into our MongoDB instance and generating some queries for performance monitoring. In previous posts we covered creating a MongoDB replica set (here) and configuring the Aria Operations Management Pack for MongoDB (here).

Reviewing the MongoDB website, there is a good article about some sample datasets: https://www.mongodb.com/developer/products/atlas/atlas-sample-datasets/. The MongoDB post covers importing the data using Atlas, then describes each data set. At the very end of the article, they cover importing this data with the mongorestore command line utility. As we do not have a GUI available with this Mongo instance, this is what we’ll do in this post.

The first step is to SSH into the primary node of our MongoDB replica set. We can find this value on the MongoDB Replica Set Details dashboard in Aria Operations (it’s in the MongoDB Replica Sets widget at the top right, in the ‘Primary Replication’ column) or by using the rs.status() command in the Mongo Shell discussed earlier in this series.
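
If you’d rather grab this from the command line, a mongosh one-liner can report the primary directly. This is just a quick sketch, assuming the same root credentials used elsewhere in this series (it will prompt for the password):

mongosh --username root --quiet --eval 'rs.status().members.filter(m => m.stateStr == "PRIMARY").map(m => m.name)'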

From the /tmp directory, we’ll download the sampledata archive using the curl command line utility, as shown below:

curl https://atlas-education.s3.amazonaws.com/sampledata.archive -o sampledata.archive

The download will be about 372MB. Once we have the file, we will use the command line mongorestore command with the following syntax:

mongorestore --archive=sampledata.archive -u root -p 'password'

We can get the root password from the console of the first VM in our cluster, the one where we ran rs.initiate earlier. The restore should complete rather quickly. Progress is written to the screen during the restore; the final line in my output was:

2024-05-11T18:21:40.888+0000    425367 document(s) restored successfully. 0 document(s) failed to restore.

A couple hundred thousand records should be enough to work with for our needs — where we primarily want to make sure our monitoring dashboard is working.
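
To spot check what the restore created before moving on, we can list the database names directly from mongosh. A minimal example, again assuming the root credentials described above:

mongosh --username root --quiet --eval 'db.adminCommand("listDatabases").databases.map(d => d.name)'

The sample_* databases from the archive should appear in the returned list.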

Having data in our database isn’t really enough; we also need some queries running against it. I’m sure there are more complete load-generation tools out there (such as YCSB), but after a quick search I found a couple of PowerShell examples for connecting to MongoDB (https://stackoverflow.com/questions/45010964/how-to-connect-mongodb-with-powershell). One of them is a module available in the PowerShell Gallery, Mdbc, which was easy to install with Install-Module Mdbc, so I gave it a shot. One of the first issues I encountered was with the default root password I was using: it contained a colon, which is the character that separates username:password in the connection string. I found a quick way to escape the special characters, and with a little more trial and error was able to build a working connection string. Another thing I ran into was that the default readPreference sends all reads to the primary node, so neither of my secondary nodes was really doing anything. I ended up using ‘secondaryPreferred’ so that I could see load on multiple nodes in the cluster.

$mongoPass = [uri]::EscapeDataString('wj:dFDgb6tom')
$mongoConnectString = "mongodb://root:$mongoPass@svcs-mongo-01.lab.enterpriseadmins.org,svcs-mongo-02.lab.enterpriseadmins.org,svcs-mongo-03.lab.enterpriseadmins.org/?readPreference=secondaryPreferred"

With the password escaped and the connection string built, it is easy to connect to the database. For example, to return a list of databases from the mongo instance, I can run the following command:

Connect-Mdbc $mongoConnectString *

# List returned:
admin
config
local
sample_airbnb
sample_analytics
sample_geospatial
sample_guides
sample_mflix
sample_restaurants
sample_supplies
sample_training
sample_weatherdata

Running Connect-Mdbc $mongoConnectString sample_analytics * (adding a specific database name to the command) will return the three collections (tables) in that database. A few quick foreach loops later, we have a query loop that will run for a fairly long time, and we could easily give it more iterations. It produces some basic output to watch so you know it is working, and CTRL+C will let you exit the loop at any point.

$randomCounts = 2
1..1000 | %{
  $myResults = @()
  foreach ($thisDB in (Connect-Mdbc $mongoConnectString * |?{$_ -match 'sample'} | Get-Random -Count $randomCounts)) {
    foreach ($thisTable in (Connect-Mdbc $mongoConnectString $thisDB * | Get-Random -Count $randomCounts)) {
      Connect-Mdbc $mongoConnectString $thisDB $thisTable | Get-Random -Count $randomCounts
      $myResults += [pscustomobject][ordered]@{
        "Database" = $thisDB
        "Table"    = $thisTable
        "RowCount" = (Get-MdbcData -as PS | Measure-Object).Count
      } # end outputobject
    } # end table loop
  } # end db loop
  $rowsReturned = ($myResults | Measure-Object -Property rowcount -sum).Sum
  "Completed iteration $_ and returned $rowsReturned rows"
} # end counter loop

While running the above loop, I also went through and rebooted cluster nodes to see what would happen and whether queries failed. The cluster was more resilient than I had expected. This worked well to generate some CPU load on my Mongo VMs and populate an Aria Operations dashboard.

Posted in Lab Infrastructure, Virtualization | Leave a comment

Optimizing Operations: Aria Operations Management Pack for MongoDB

In a previous post (here), I covered how to set up a MongoDB replica set to be monitored by Aria Operations in a vSphere-based lab. This article will cover the installation and configuration of the Aria Operations Management Pack for MongoDB. The article is divided into three sections:

  • Installing the Management Pack – which will cover the Aria Suite Lifecycle Marketplace workflow
  • Configuring a service account – this will be the monitoring user for the MongoDB instances, configured inside Mongo Shell.
  • Adding a MongoDB adapter instance in Aria Operations – configuring Aria Operations to use the new management pack and service account.

Installing the Management Pack

In Aria Suite Lifecycle (formerly known as vRealize Suite Lifecycle Manager), I typically stay in the Lifecycle Operations, Locker, or Content Management tiles. In this post we’ll use the Marketplace tile to add a management pack for MongoDB. For details on creating a MongoDB Replica Set cluster, feel free to check out this previous post.

At the top of the Marketplace screen there is a search box. When I search for Mongo, I get three results; hovering over each shows that one of them is Aria Operations Management Pack for MongoDB Version 9.0:

After selecting DOWNLOAD in the bottom right of the tile, we are presented the EULA. After reading the document, we check the box and select next. On the last page of this wizard, we enter our name, email address, company name, and country, then click DOWNLOAD.

This will show a message: Success. Download for VMware Aria Operations Management Pack for MongoDB 9.0 is initiated. For more details visit request page. The request text in that message is a link to the specific request, which should contain just one stage and complete fairly quickly. Here is a screenshot of the expected results of the download task:

Back in the Marketplace, we switch to the ‘Available’ tab, which shows the subset of management packs that we’ve already downloaded. Due to the length of management pack names, it may be helpful to search for mongo in the search bar.

When we select ‘View Details’ a page will appear that provides some details about the selected management pack. If we scroll to the end of the page, there are buttons for INSTALL and DELETE. Since we are installing this for the first time, we’ll select INSTALL.

We must then pick our Datacenter and Environment and finally select the INSTALL button in the lower right.

This will result in a banner stating Installation in progress. Check request status. Request Status is again a link that takes us to the ‘Requests’ page. Again this should be a single stage task, but it’ll likely take longer than the download.

Once this task completes, our management pack will be available in the Aria Operations instance. We’ll just need to give it hostnames and credentials to begin collecting data.

Configuring a service account

From an SSH session to our primary MongoDB node, we’ll open a Mongo Shell with the following command:

mongosh --username root

We’ll enter our root password when prompted. One important thing to note is that the root password for MongoDB is the root password of the node where we initiated the replica set. From the mongosh prompt, we’ll run the following commands:

// specify that we want to use the admin database instead of the default test
use admin

// Create our limited access service account for monitoring.  
db.createUser(
   {
     user: "svc-ariaopsmp",
     pwd: "VMware1!",
     roles: [ { role: "clusterMonitor", db: "admin" } ]
   }
)

// Confirm that it worked
db.getUser('svc-ariaopsmp')

When the above commands complete, the final JSON object printed to the screen should look like this:

{
  _id: 'admin.svc-ariaopsmp',
  userId: UUID('2ee886ee-f700-46e7-b022-950fbb36034f'),
  user: 'svc-ariaopsmp',
  db: 'admin',
  roles: [ { role: 'clusterMonitor', db: 'admin' } ],
  mechanisms: [ 'SCRAM-SHA-1', 'SCRAM-SHA-256' ]
}

This shows that we have a user ‘svc-ariaopsmp’ with the ‘clusterMonitor’ role, a limited permission set intended for monitoring operations, which is exactly what we are doing with Aria Operations.
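
Before leaving the SSH session, it is worth confirming that the new account can actually authenticate and read cluster status. Here is a quick check, assuming the username/password created above; the serverStatus command is covered by the clusterMonitor role:

mongosh --username svc-ariaopsmp --password 'VMware1!' --authenticationDatabase admin --quiet --eval 'db.adminCommand({ serverStatus: 1 }).host'

If this returns the host:port of the node rather than an authentication error, the service account is ready for Aria Operations.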

Adding a MongoDB adapter instance in Aria Operations

From Aria Operations, we’ll navigate to Data Sources > Integrations > Repository and verify we can see our newly installed integration. If we click the tile, details come up about the management pack, including the metrics collected and the content that is now available. We can select ‘ADD ACCOUNT’ at the top of the screen to create an adapter instance.

When adding an account there are only a few fields required:

The ‘Name’ field is used in some of the dashboards to identify which MongoDB instance is being described, so it is important to choose a short, descriptive name. In my case I’m going to call this ‘svcs-mongo-replicaset’ to denote that it is deployed in my services network, is running Mongo, and is a replica set cluster.

The ‘Description’ field is not shown by default on the MongoDB dashboards added by the management pack, but it can contain additional details about the adapter instance. If you have various MongoDB instances managed by different teams, this might be a good place to store a contact, for example in case your credentials expire or stop working.

The ‘Host’ field is where we enter the hostname of our MongoDB instance. Since we’ve configured a replica set, I’m entering a comma-separated list of all the nodes of the MongoDB cluster. This helps ensure we keep getting monitoring data even if one node fails.

The ‘Credential’ field allows us to specify whether authentication is required and to enter username/password details. In my case I do require a username/password, and I am only running mongod, not mongos, but I’m going to enter the same account in both fields in case something changes in the future. Here is what my new credential looks like:

Finally, I ‘VALIDATE CONNECTION’ and wait for the ‘Test connection successful’ notice to appear, then click ‘ADD’ at the bottom of the screen.

Additional advanced settings are available, like which service to connect to, specifics around SSL, timeouts, and autodiscovery. Those settings did not need to be tweaked for this simple environment.

After 5-10 minutes, we should have some initial data. Browsing to Visualize > Dashboards, we can filter the list of dashboards for Mongo. There are seven MongoDB-related dashboards available from the management pack we installed. Not all of them will have data; for example, we don’t have mongos deployed. However, if we select MongoDB Replica Set Details we should see some information about our environment.

To make the most of this, we’ll really need some sample data and a way to run queries as needed to have something to monitor. We’ll cover that in the next post (here).

Posted in Lab Infrastructure, Virtualization | 1 Comment

Creating a MongoDB Replica Set

I was recently looking at the Aria Operations Management Pack for MongoDB (https://docs.vmware.com/en/VMware-Aria-Operations-for-Integrations/9.0/Management-Pack-for-MongoDB/GUID-73744E17-88DD-49A1-8B86-5BD896C874D8.html) and wanted to kick the tires in my vSphere-based lab. To be able to test this management pack, I wanted to deploy a three node MongoDB replica set.

To begin, I found a solid starting point: the Bitnami MongoDB appliance (https://bitnami.com/stack/mongodb/virtual-machine). I downloaded this appliance, which included MongoDB 7.0.9 pre-installed.

I deployed three copies of the appliance, specifying static IPs during deployment, and pre-created forward and reverse DNS records for these IPs. The VMs come configured with 1 vCPU and 1GB of RAM. When building the replica set I ran into a few issues with only 1GB of RAM; 2GB seemed better. Once the VMs were powered on, I logged in with the bitnami/bitnami username/password combination and changed the bitnami password. I then made a few changes in the OS using the vSphere web console.

sudo nano /etc/ssh/sshd_config
# find PasswordAuthentication no and change to yes

sudo rm /etc/ssh/sshd_not_to_be_run
sudo systemctl start sshd
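
Before moving on, a quick check confirms that sshd is now listening on port 22:

sudo ss -tlnp | grep ssh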

This allowed me to log in over SSH to make the remaining changes. Over SSH I made some additional changes to allow MongoDB to be accessed remotely:

sudo nano /etc/nftables.conf  # find and add "tcp dport { 27017-27019 } accept" in the section below accepting tcp22
sudo nft -f /etc/nftables.conf -e

I also wanted to clean up a couple of general networking issues, namely setting the hostname of the VM and removing an extra secondary IP address that might get pulled via DHCP.

# set the OS hostname
echo mongo-xx.lab.example.com | sudo tee /etc/hostname

# disable the DHCP interface if using static addresses
sudo nano /etc/network/interfaces
# find the line at the end of the file for 'iface ens192 inet dhcp' and remove it or make it a comment.

# confirm DNS resolution is configured
sudo nano /etc/resolv.conf  # review for nameserver entries

sudo systemctl restart networking
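
After the networking restart, a couple of quick checks confirm that the hostname and interface configuration took effect:

hostname -f          # confirm the hostname we set earlier
ip -brief address    # confirm only the static address remains on the interface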

Next up, we will make some changes to MongoDB to support our replica set. We need a security key that the nodes use to authenticate to each other. Instead of using a short password, from my local machine running PowerShell I created a new GUID to use as the security key and removed the hyphens.

[guid]::NewGuid() -Replace '-', ''

For me, this created the string 88157a33a9dc499ea6b05c504daa36f8, which I’ll reference throughout this document. There are other ways to create longer/more complex keys, but this was something quick. We need to put this key in a file that will be referenced in the mongo config file. To create and secure the key file, we’ll use the following commands:

echo '88157a33a9dc499ea6b05c504daa36f8' | sudo tee /opt/bitnami/mongodb/conf/security.key
sudo chmod 400 /opt/bitnami/mongodb/conf/security.key
sudo chown mongo:mongo /opt/bitnami/mongodb/conf/security.key

Now that we have that security key, we’ll update our mongo config.

sudo nano /opt/bitnami/mongodb/conf/mongodb.conf

# Find and remove the comments for replication and replSetName, and set a replication set name if desired.

# uncomment #keyFile at the end of the file and set value to /opt/bitnami/mongodb/conf/security.key

# save the file, restart services
sudo systemctl restart bitnami.mongodb.service
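
A quick grep can confirm the edits took before we move on. The sections follow mongod’s YAML layout, so the output should look roughly like the comments below (a sketch; the exact layout of the Bitnami config file may differ slightly):

grep -E -A2 '^(replication|security)' /opt/bitnami/mongodb/conf/mongodb.conf
# expected output, roughly:
# replication:
#   replSetName: replicaset
# security:
#   keyFile: /opt/bitnami/mongodb/conf/security.key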

Now that the cluster nodes are all prepared, we can turn this into a replica set cluster. Since I’m not familiar with this process, I decided to create snapshots of all the nodes (PowerCLI makes this easy: get-vm mongo-0* | New-Snapshot -Name 'Pre RS config') and reboot them for good measure. I’ll log back into the first node of the cluster over SSH and enter the mongo shell, using the root password found on the virtual appliance console.

mongosh --username root
# enter the root password

# initiate the cluster
rs.initiate( {
   _id : "replicaset",
   members: [
      { _id: 0, host: "mongo-01.lab.example.com:27017" },
      { _id: 1, host: "mongo-02.lab.example.com:27017" },
      { _id: 2, host: "mongo-03.lab.example.com:27017" }
   ]
})

# check the status to ensure cluster is working
rs.status()

The rs.status() command should return the status of the cluster. We should see that one of our nodes is the primary and the other two are secondaries (based on the stateStr property).
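
If you just want the member states without scrolling through the full rs.status() output, a one-liner like this prints each member and its state (a sketch, using the same root credentials):

mongosh --username root --quiet --eval 'rs.status().members.forEach(m => print(m.name + " " + m.stateStr))'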

With our replica set working, we can now start monitoring. The next post (link) will cover installing and configuring the management pack. The final post in this series (link) will show how to import some sample data and run sample queries to put read load on the replica set.

Posted in Lab Infrastructure, Scripting, Virtualization | 2 Comments

Ubuntu 24.04, Packer, and vCenter Server Customization Specifications

I’ve heard from several people who highly recommend Packer (https://packer.io) for creating standardized images. I’ve been wanting to dig into it for some time, and with the recent release of Ubuntu 24.04 decided now would be a good time. I plan on using this template for most Linux VMs deployed in my vSphere-based home lab. In addition to the base install of Ubuntu, there are a handful of agents/customizations that I wanted to have available:

  • Aria Operations for Logs agent
  • Aria Automation Salt Stack Config minion
  • Trust for my internal root CA
  • An Active Directory join for centralized authentication

I ended up with a set of Packer configuration files and a customization spec that did exactly what I wanted. With each install or customization, I tried to decide whether it would be best to include the automation in the base image (executed by Packer) or in the customization spec (executed by the customization script). Some of this came down to personal preference, and I might revisit the choices in the future. For example, I’ve placed the code to trust my internal CA into the base template. I may want to remove that from the template and instead maintain multiple customization specs, so there is an option where that certificate is not trusted automatically.

For those interested, I’ve summarized the final output in the next two sections, but also tried to document notes and troubleshooting steps toward the end of the article.

Packer Configuration

The Packer Configuration spans several files. I’ve described each file below and attached a zip file with the working configuration.

  • http\meta-data – this is an empty file but is expected by Packer.
  • http\user-data – this file contains a listing of packages installed automatically and some commands run automatically during template creation. For example, these commands allow VMware Tools customization to execute custom scripts.
  • setup\setup.sh.txt – this is a script which runs in the template right before it is powered off. It contains some cleanup code and agent installs. You’ll need to rename this file to remove the .txt extension if you want it to execute.
  • ubuntu.auto.pkr.hcl – contains variable declarations and then defines all the virtual machine settings which are created.
  • variables.pkrvars.hcl – contains shared values (vCenter Server details, credentials, Datacenter, Datastore, etc.) which may be consumed by multiple templates.

Download: https://enterpriseadmins.org/files/Packer-Ubuntu2404-Public.zip

With these files present in a directory, I downloaded the Packer binary for my OS (from: https://developer.hashicorp.com/packer/install?product_intent=packer) and placed it in the same directory. From there I only needed to run two commands.

./packer.exe init .
./packer.exe build .

The first command initializes Packer and downloads the vSphere plugin we’ve specified. The second command actually kicks off the template build. In my lab this took about 6 minutes to complete. Once finished, I had a new vSphere template in my inventory which could be deployed easily.

vSphere Customization Specification > Customization script

The customization spec includes things like how to name the VM, the time zone, network settings, etc. The part that really helps with completing the desired customizations is the customization script. Getting it right took a bit of trial and error, described in the notes section at the end of this article. I’ve included the final script below for reference. This code runs as part of the virtual machine deployment and is unique to each VM.

#!/bin/sh
if [ x$1 = x"precustomization" ]; then
    echo "Do Precustomization tasks"
    # append group to sudoers with no password
    echo '%lab\ linux\ sudoers ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
elif [ x$1 = x"postcustomization" ]; then
    echo "Do Postcustomization tasks"
    # generate new openssh-server key
    test -f /etc/ssh/ssh_host_dsa_key || dpkg-reconfigure openssh-server
    # make home directories automatically at login
    /usr/sbin/pam-auth-update --enable mkhomedir
    # do a domain join and then modify the sssd config
    echo "VMware1!" | /usr/sbin/realm join lab.enterpriseadmins.org -U svc-windowsjoin --computer-ou "ou=services,ou=lab servers,dc=lab,dc=enterpriseadmins,dc=org"
    sed -i -e 's/^#\?use_fully_qualified_names.*/use_fully_qualified_names = False/g' /etc/sssd/sssd.conf
    systemctl restart sssd.service
fi

Notes / troubleshooting

I wanted to make this process as low touch as possible, so I needed to automate several agent installations and other customizations. The notes below capture the issues I ran into along the way.

I had previously saved some sample configuration files for Ubuntu 22.04 (unfortunately I didn’t bookmark the original source). I cleaned up the files a bit, removing some declared variables that weren’t in use. I downloaded the Ubuntu 24.04 ISO image, placed it on a vSphere datastore, and updated the iso_paths property in the ubuntu.auto.pkr.hcl file and other credential/environmental values in the variables.pkrvars.hcl accordingly.

The initial build completed without incident, creating a vSphere template, but the first deployment failed. Reviewing the /var/log/vmware-imc/toolsDeployPkg.log file, the message ERROR: Path to hwclock not found was observed. There was a KB article for this (https://kb.vmware.com/s/article/95091) related to Ubuntu 23.10, which mentioned that the util-linux-extra package was needed. I added this to the list of packages in the user-data file and rebuilt the template using packer build. This resolved the issue and subsequent deployments were successful.

One thing I noticed was that the resulting virtual machine had two CD-ROM devices. I looked around and found a PR (link) stating that an option existed to control this behavior as of version 1.2.4 of the vSphere plugin. I updated the required_plugins mapping in the ubuntu.auto.pkr.hcl file to state that 1.2.4 is the minimum required version, then added reattach_cdroms = 1 later in the file with the other CD-ROM related settings.

One other thing I noticed in this process was that it would be helpful to have a date/time stamp either in the VM name or the notes field, just to know when that instance of a template was created. I looked around, found out how to get a timestamp, and used that syntax to add a notes = "Template created ${formatdate("YYYY-MM-DD", timestamp())}" property to my ubuntu.auto.pkr.hcl file.

After making the above fixes, I deployed a VM from the latest template and applied a customization spec containing a customization script to handle a few final tasks (update /etc/sudoers, generate a new openssh-server key, complete the domain join, modify the sssd configuration, and restart the sssd service). This script failed to execute. Reviewing /var/log/vmware-imc/toolsDeployPkg.log, I noticed the message user defined scripts execution is not enabled. To enable it, please have vmware tools v10.1.0 or later installed and execute the following cmd with root privilege: 'vmware-toolbox-cmd config set deployPkg enable-custom-scripts true'. Back in my user-data configuration file, in the late-commands section, I added this command to enable custom scripts in the template.
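
For reference, the entry I added to late-commands looks roughly like the line below. This is a sketch: Ubuntu autoinstall late-commands run in the installer environment, so the command is wrapped with curtin in-target to run against the installed system.

curtin in-target --target=/target -- vmware-toolbox-cmd config set deployPkg enable-custom-scripts true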

After rebuilding the template to enable custom scripts, I deployed a new VM. This did not complete the domain join as I had hoped: all of my commands were running in the precustomization phase, before the virtual machine was on the network. I found the following KB article: https://kb.vmware.com/s/article/74880, which described how to run some commands during precustomization and others during postcustomization. Moving the domain join to postcustomization solved the issue, as the VM was on the network by the time the join ran.

I wanted the templates to trust my internal CA, so I added a few commands to the setup.sh script to download the certificate file from an internal webserver and run update-ca-certificates.
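
The commands themselves are simple; here is a minimal sketch, where the URL and certificate file name are placeholders for my internal web server and root CA:

# download the internal root CA (placeholder URL) into the OS trust store location
wget -q http://webserver.lab.example.com/lab-root-ca.crt -O /usr/local/share/ca-certificates/lab-root-ca.crt
# register it with the system trust store (add sudo if your provisioner does not run as root)
update-ca-certificates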

The next task I wanted to complete was the installation of the Aria Automation Config (aka Salt Stack Config) minion. In the past I had used the salt-project version of the minion, but reviewing VMware Tools documentation (https://docs.vmware.com/en/VMware-Tools/12.4.0/com.vmware.vsphere.vmwaretools.doc/GUID-373CD922-AF80-4B76-B19B-17F83B8B0972.html) I found an alternative way. I added the open-vm-tools-salt-minion as a package in the user-data file and had Packer add additional configuration_parameters to the template to specify the salt_minion.desiredstate and salt_minion.args values.
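
After deploying a test VM, I wanted to confirm that the minion actually came up. The checks below are assumptions based on a standard Salt minion install (the service name may differ depending on how the open-vm-tools integration packages it):

systemctl status salt-minion    # confirm the minion service is present and running
salt-call --local test.ping     # masterless ping to prove the minion itself works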

I also wanted the template to include the Aria Operations for Logs (aka Log Insight) agent. The product documentation showed how to pass configuration during install (https://docs.vmware.com/en/VMware-Aria-Operations-for-Logs/8.16/Agents-Operations-for-Logs/GUID-B0299481-23C1-482D-8014-FAC1727D515D.html). However, I had problems automating the download of the agent: doing a wget of the link from the agent section of the Aria Ops for Logs console returned an HTML redirect instead of the package. I found this article: https://michaelryom.dk/getting-log-insight-agent, which provided an API link to download the package, and I was able to wget that file. I placed the wget and install commands in the setup.sh script that runs right before the new template is powered down.
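
Roughly, the relevant lines in setup.sh look like the sketch below. The hostname is a placeholder, the API path comes from the article linked above, and SERVERHOST is the install-time variable described in the agent documentation, so verify both against your version:

# download the .deb agent package via the Aria Ops for Logs API (placeholder hostname)
wget -q --no-check-certificate https://loginsight.lab.example.com/api/v1/agent/packages/types/deb -O /tmp/li-agent.deb
# install the agent, pointing it at the server during installation
SERVERHOST=loginsight.lab.example.com dpkg -i /tmp/li-agent.deb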

After rebuilding the template with packer, I deployed another test VM. I confirmed that:

  • SSH worked
  • AD Authentication worked
  • The Aria Ops for Logs agent sent logs
  • My internal CA was trusted
  • The Aria Automation Config minion was reporting (the key needed to be accepted in the console)

Repackaging the template VM takes about 6 minutes. Deploying and customizing the template takes about 2 minutes, and everything I wanted in the VM is ready to go.

Posted in Lab Infrastructure, Scripting, Virtualization | Leave a comment