Automated Deployment of Storage Server

In this post, I’ll walk through the process of automating the deployment of a storage server using Aria Automation. Specifically, I’ll show how to add an option to deploy a storage server dynamically in your lab environment and troubleshoot an iSCSI duplicate LUN ID issue.

In a recent post, I documented the creation of a storage server that could provide NFS or iSCSI storage for use in my lab. To make consuming this appliance easier, I wanted to add a check box to my Aria Automation request form to determine if a storage server was needed. If this box was checked, then a storage server would be included with the deployment.

To achieve this, I added an additional vSphere.Machine object to my Assembler template and connected it to the same network as the nested ESXi host. I then added a boolean input to the template. Finally, I set the ‘count’ property of the machine object using the following logic:

      count: ${input.deployStorageServer == true ? 1 : 0}

This logic says that if the box is checked (boolean true) then the count is 1, otherwise it is 0. Here is a screenshot of the canvas after adding the machine:

I now have a toggle to deploy (or not) an external storage appliance when deploying nested ESXi hosts.
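Putting those pieces together, the relevant parts of my Assembler template look roughly like the sketch below. The resource name, image, and network reference are placeholders standing in for my lab values; only the boolean input and the count expression come directly from the steps above.

```yaml
inputs:
  deployStorageServer:
    type: boolean
    title: Deploy storage server?
    default: false
resources:
  Storage_Server:
    type: Cloud.vSphere.Machine
    properties:
      # a count of 0 skips this machine entirely when the box is unchecked
      count: ${input.deployStorageServer == true ? 1 : 0}
      image: storage-appliance
      networks:
        - network: ${resource.ESXi_Network.id}
```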

Duplicate Device ID issue

After creating a couple of test deployments and adding hosts to vCenter Server, I ran into an interesting issue. For deployment #1, I formatted LUN0 as VMFS. For deployment #2, I tried to format LUN0 but it didn’t appear in the GUI. Looking a bit more closely at the devices, I realized that the unique identifier (naa.* value) of each LUN was the same. This makes sense, as each appliance was cloned with the iSCSI devices already presented. Next, I unmapped and remapped the iSCSI device from the appliance, using the commands below.

sudo targetcli

# Remove the iscsi LUN mapping
cd /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns
delete 0

# Remove the disk mapping & recreate
cd /backstores/fileio/
delete disk0
create disk0 /data/iscsi/disk0.img sparse=true write_back=false

# Recreate the iscsi LUN mapping
cd /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns
create /backstores/fileio/disk0

After doing this and rescanning the host, I confirmed that the unique identifier (naa ID) changed and I was able to format LUN0 as VMFS.

Longer term, I decided that I should not have the iSCSI mappings pre-created in my storage appliance. I removed the configuration (clearconfig confirm=True) from the appliance, and instead placed the following script at /data/iscsi/setup.sh:

targetcli backstores/fileio create disk0 /data/iscsi/disk0.img sparse=true write_back=false
targetcli backstores/fileio/disk0 set attribute is_nonrot=1
targetcli backstores/fileio create disk1 /data/iscsi/disk1.img sparse=true write_back=false
targetcli backstores/fileio/disk1 set attribute is_nonrot=1
targetcli backstores/fileio create disk2 /data/iscsi/disk2.img sparse=true write_back=false
targetcli backstores/fileio/disk2 set attribute is_nonrot=1

targetcli /iscsi create iqn.2025-02.com.example.iscsi:target01 
targetcli /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns create /backstores/fileio/disk0
targetcli /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns create /backstores/fileio/disk1
targetcli /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns create /backstores/fileio/disk2

targetcli /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/acls create iqn.1998-01.com.vmware:h316-vesx-64.lab.enterpriseadmins.org:394284478:65

targetcli saveconfig

Now, after deploying the storage appliance, if I need an iSCSI target I can run sudo /data/iscsi/setup.sh and have new identifiers generated at runtime. This eliminates the identifier duplication and gets the automation ~80% of the way there.

With the steps outlined above, I was able to automate the deployment of storage servers and resolve the issue with duplicate LUN IDs. This process saves time and ensures each deployment is consistent. Going forward, I’ll continue automating aspects of my lab environment to increase efficiency.

Posted in Lab Infrastructure, Virtualization

How to Set Up a Minimal NFS and iSCSI Storage Solution Using Ubuntu 24.04

In my lab, I often need different types of storage to test various scenarios. For example, just last week someone asked about using New-Datastore with a specific version of VMFS and I needed to quickly perform a syntax check. I’ve found that having a nested storage appliance, like an Openfiler or FreeNAS, available is helpful. However, these appliances offer way more features than I need and typically have higher resource requirements. Setting up specific storage protocols like NFS or iSCSI is often crucial for testing and development, but existing solutions can be overly complex or resource-heavy for lab environments. In this post I’ll outline how I solved this problem with a few utilities added to an existing Ubuntu 24.04 template.

Storage Protocols

With this project I wanted a single storage target that could provide both NFS and iSCSI. In my lab, the ‘client’ system for nearly all storage testing will be ESXi, so this post provides output/examples in that context. ESXi supports block storage (such as iSCSI) and file storage, specifically NFS 3 and NFS 4.1. Ideally, I want to provide all three of these options with this single appliance, so we’ll show examples of using the appliance in all three of those ways.

Setting Up the Test Appliance

I deployed an Ubuntu 24.04 VM, using the image/customization spec described here: https://enterpriseadmins.org/blog/scripting/ubuntu-24-04-packer-and-vcenter-server-customization-specifications/

The template VM has a single 50GB disk, so I added an additional 15GB disk to use as the backing for the storage server. We’ll format this disk as btrfs and mount it at /data, as shown in the following code block:

sudo mkdir /data
sudo mkfs.btrfs /dev/sdb
echo "/dev/sdb /data btrfs defaults 0 0" | sudo tee -a /etc/fstab
sudo systemctl daemon-reload
sudo mount /data

The above code block creates a folder, formats the second disk in the system, adds an entry to the fstab file so the filesystem mounts when the system boots, and finally mounts the new disk. After the above is complete, running df -h /data should return the mounted disk and its size, confirming that everything worked successfully.

Configuring NFS on Ubuntu 24.04

I’ll start with NFS, as this is a problem I’ve previously solved using Photon OS (https://enterpriseadmins.org/blog/virtualization/vmware-workstation-lab-photon-os-container-host-and-nfs-server/). The only difference this time is that Ubuntu 24.04 has a slightly different package name for the NFS server components.

sudo apt install nfs-kernel-server -y
sudo mkdir /data/nfs
echo "/data/nfs *(rw,async,no_root_squash,insecure_locks,sec=sys,no_subtree_check)" | sudo tee -a /etc/exports
sudo systemctl daemon-reload
sudo systemctl reload nfs-server

The above code block installs the NFS server package, creates a subfolder to export over NFS, adds an entry to the NFS exports configuration file (the no_root_squash option matters here, since ESXi mounts NFS as root), then reloads the configuration so it takes effect. On our client system (ESXi), we can confirm our work was successful by creating a datastore. I’ll complete that task in PowerCLI below:

New-Datastore -VMHost h316-vesx-64* -Name 'nfs326' -Nfs -NfsHost 192.168.10.26 -Path /data/nfs -Confirm:$false

Name                               FreeSpaceGB      CapacityGB
----                               -----------      ----------
nfs326                                  14.994          15.000

As we can see, the test was a success and returned our mount point size in the command results. The above example resulted in an NFS 3 mount of the NFS folder. I created a subfolder (test41) and executed a similar test to confirm this could work for NFS 4.1 as well.

New-Datastore -VMHost h316-vesx-64* -Name 'nfs326-41' -Nfs -FileSystemVersion 4.1 -NfsHost 192.168.10.26 -Path /data/nfs/test41 -Confirm:$false

Name                               FreeSpaceGB      CapacityGB
----                               -----------      ----------
nfs326-41                               14.994          15.000

As we can see in the vCenter web interface, one of these datastores is NFS 3 and the other is NFS 4.1, both showing the same capacity and free space.

This confirms that we were able to successfully connect to our NFS Server service using ESXi with both NFS 3 and NFS 4.1 connections. Next, we’ll look at setting up iSCSI storage, which requires a slightly different approach.

iSCSI

The iSCSI testing was a bit more interesting. Looking around, I found a couple of ways to create an iSCSI target and ended up using targetcli. There are plenty of tutorials available for this, including a video: https://www.youtube.com/watch?v=OIpxwX6pTIU and an Ubuntu 24.04-specific article: https://www.server-world.info/en/note?os=Ubuntu_24.04&p=iscsi&f=1, both of which were very helpful. I’ll document the steps below for completeness.

In this first code block we’ll install the service and create the folder where we’ll store some disk images.

sudo apt install targetcli-fb -y
sudo mkdir /data/iscsi

The targetcli command can create image files to use as backing for our disks, but in my testing the sparse=true switch did not create sparse files, so we’ll do this in two steps. You’ll note that I’m specifying one image as having a 2T file size, but as you may have noticed in the NFS example, we only have 15GB of disk for our /data mount. This doesn’t result in some ‘magic beans’ sort of free storage: once we write 15GB of data to this disk we’ll be out of capacity and run into problems. It is only being done for illustration/simulation purposes, since sometimes you’ll want the UI to show ‘normal’ sizes that you’d see with actual datastores.

One reason the /data mount was configured with btrfs instead of something like ext4 is so we can support image files larger than 16T (the maximum file size on ext4). This btrfs filesystem will allow files over 62TB in size (62TB being the maximum supported virtual disk size for VMFS 6). In the code block output below, we’ll also use du to show that these disks are using 0 bytes on the filesystem but have larger apparent sizes.

sudo truncate -s 10G /data/iscsi/disk0.img
sudo truncate -s 10G /data/iscsi/disk1.img
sudo truncate -s 2T /data/iscsi/disk2.img

du -h /data/iscsi/*.img
0       /data/iscsi/disk0.img
0       /data/iscsi/disk1.img
0       /data/iscsi/disk2.img

du -h /data/iscsi/*.img --apparent-size
10G     /data/iscsi/disk0.img
10G     /data/iscsi/disk1.img
2.0T    /data/iscsi/disk2.img
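The same sparse behavior can be confirmed with stat, which reports both the apparent size and the number of blocks actually allocated. This is a throwaway illustration against a temp file, not part of the appliance build:

```shell
# Create a 1GiB sparse file, then compare apparent size vs. allocated blocks
truncate -s 1G /tmp/sparse-demo.img
stat -c 'apparent=%s bytes, allocated=%b blocks' /tmp/sparse-demo.img
```

A fully sparse file shows the full apparent size (1073741824 bytes) while allocating few, typically zero, blocks on disk.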

Once we have our files pre-staged, we can start working with targetcli.

sudo targetcli

# this should enter the targetcli shell

cd /backstores/fileio
create disk0 /data/iscsi/disk0.img sparse=true write_back=false
cd disk0
set attribute is_nonrot=1
cd ..
create disk1 /data/iscsi/disk1.img sparse=true write_back=false
cd disk1
set attribute is_nonrot=1
cd ..
create disk2 /data/iscsi/disk2.img sparse=true write_back=false
cd disk2
set attribute is_nonrot=1

The above code block creates the fileio references to each of our disks and sets the is_nonrot flag to tell the system these are non-rotational (i.e., flash) devices.

Still in the targetcli shell, we’ll start our iSCSI configuration.

cd /iscsi
create iqn.2025-02.com.example.iscsi:target01 
cd iqn.2025-02.com.example.iscsi:target01/tpg1/luns
create /backstores/fileio/disk0
create /backstores/fileio/disk1
create /backstores/fileio/disk2

This will create LUNs for each of our disks. Finally, still in the targetcli shell, we’ll create an ACL to allow a specific host to access the target, then delete it. This puts the correct syntax in our command history so we can refer back to it in the future (I plan to use this as a template for future tests).

cd /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/acls
create iqn.1998-01.com.vmware:host.lab.enterpriseadmins.org:3:65
delete iqn.1998-01.com.vmware:host.lab.enterpriseadmins.org:3:65
exit

The exit will cause targetcli to save our changes so they’ll persist across a reboot. For testing, we’ll go back into targetcli and add a specific entry to allow our test host to access the iSCSI target.

sudo targetcli
cd /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/acls
create iqn.1998-01.com.vmware:h316-vesx-64.lab.enterpriseadmins.org:394284478:65
exit

On our test client system, we can add a dynamic target for the IP/name & port 3260 of our storage appliance and then rescan for storage. We should see the three disks that we created, with the sizes specified.

As another confirmation, we may want to make each of these disks a VMFS volume. We can do that using syntax similar to the below code block:

Get-ScsiLun -VmHost h316* | ?{$_.Vendor -eq 'LIO-ORG'} | %{
  New-Datastore -VMHost h316* -Name "ds-vmfs-$($_.Model)" -Path $_.CanonicalName -Vmfs -FileSystemVersion 6
}

Name                               FreeSpaceGB      CapacityGB
----                               -----------      ----------
ds-vmfs-disk1                            8.345           9.750
ds-vmfs-disk2                        2,046.312       2,047.750
ds-vmfs-disk0                            8.345           9.750

Looking in the vCenter web interface, we can see that all our storage has been presented.

Once we’ve placed filesystems on these disks, we can go back to the shell and see how much space is being used on disk.

du -h /data/iscsi/*.img
29M     /data/iscsi/disk0.img
29M     /data/iscsi/disk1.img
62M     /data/iscsi/disk2.img

We can see that the creation of a filesystem on these disks does consume some of the blocks (we are using several MB of disk, instead of the previous 0 bytes).

Adding extra LUNs to the iSCSI target is a straightforward process, requiring just a handful of commands. An example can be found in the code block below:

sudo truncate -s 10T /data/iscsi/disk3.img
sudo targetcli
# this should enter the targetcli shell

cd /backstores/fileio
create disk3 /data/iscsi/disk3.img sparse=true write_back=false

cd /iscsi/iqn.2025-02.com.example.iscsi:target01/tpg1/luns
create /backstores/fileio/disk3

exit

The above code block shows the creation of a 10TB disk image, entering the targetcli shell, adding the newly created disk as a ‘fileio’ option, and mapping that disk to our iSCSI target. Finally we exit, which by default will save the configuration and make it persistent. Refreshing storage on ESXi should cause the new LUN to appear. Since we didn’t set the is_nonrot attribute, this device will appear as an HDD instead of a Flash device.

Growing the /data btrfs filesystem

Our filesystem is currently backed by a 15GB disk. We’ve allocated about 12TB of that, so it is grossly oversubscribed. For a production system this would be a terrible idea, but for our lab/illustration purposes it is probably fine. At some point we may need to extend this filesystem to accommodate growth. I’ve grown ext3 and ext4 filesystems before, but wanted to document how to do the extension for the btrfs filesystem used in this example. I chose btrfs because it supports larger files, allowing us to create images as large as ESXi supports (62TB). The following code block shows how to extend this filesystem in the guest OS. It assumes we have already increased the size of the disk in the vCenter web client; for illustration purposes, we’ve extended the disk from 15GB to 20GB.

df -h
# shows that the filesystem did not autogrow

echo 1 | sudo tee /sys/class/block/sdb/device/rescan
# rescans for disk changes

sudo lsblk  
# confirms disk is now seen as 20gb

sudo btrfs device usage /data
# shows that device size is 20gb

sudo btrfs filesystem resize max /data
# results in:
# Resize device id 1 (/dev/sdb) from 15.00GiB to max

df -h /data
# confirm filesystem is now 20gb

The above commands rescanned our disk to be aware of the new size, then resized the filesystem to the size we defined in the hypervisor (20GB).

To confirm this works as expected, we can refresh storage information for one of our NFS mounts. The capacity should increase from 15GB to 20GB, as seen in the following screenshot.

Conclusion

Creating this storage server to support NFS 3, NFS 4.1, and iSCSI targets is relatively straightforward. Having this pre-configured storage appliance can greatly streamline the process of testing various storage protocols, especially in virtual environments where quick deployment is key.

Posted in Lab Infrastructure, Virtualization

Extending Aria Automation with Custom Resources and Actions for IP Address Management

In my lab, I leverage Aria Automation to deploy Linux, Windows, and nested ESXi VMs. This is my primary interface for requesting new systems and covers most of the common resources I need for testing. However, I sometimes deploy one-off appliances and such, at a scale where automation hasn’t been built. These appliances typically require an IP address and DNS record. I had previously created a Jenkins job that accepted parameters, making these easy enough to create, but the cleanup is where I would fall down. I also wasn’t a huge fan of switching between the Aria Automation and Jenkins consoles to submit these requests.

My ideal solution to both of these problems was an Aria Automation request form that would create a deployment tracking these one-off IP requests. To not re-invent the wheel, this Aria Automation request could simply call Jenkins. When testing is complete, I’d have a deployment remaining in Aria Automation to serve as a reminder to properly clean up IPAM and DNS. This article will cover the process of creating this action, resource, and template to front end the Jenkins request with Aria Automation.

Custom Action – Create

In Aria Automation Assembler > Extensibility > Actions, we can create a new action. I named mine IPAM Next Address Create and selected only the project where my test deployments live.

For the action, I’m writing everything in PowerShell, since I already know that language and Aria Automation supports it. This code sample lacks robust error handling and could probably be cleaned up a fair amount, but it got the job done. In a production environment, adding some logic after each step to ensure the task completed would be prudent. In the event that the IPAM service is down or Jenkins isn’t responding, we’d want the request to behave in a predictable way.

The create section has more code, as it connects to phpIPAM to get the next address and then requests a DNS record be created by Jenkins. I obtain the IP address directly so that it can be returned as part of the deployment, where it is clearly visible.

function handler($context, $inputs) {
    $subnet = $inputs.subnet
    $hostname = $inputs.name
    
    write-host "We've received a $($inputs.'__metadata'.operation) request for subnet $subnet"
 
    $ipamServer = 'ipam.apps.example.com'
    $ipamUser   = 'svc-vra'
    $ipamPass   = 'VMware1!'
    $ipamBaseURL = 'https://'+$ipamServer+'/api/'+$ipamUser+'/'

    # Login to the API with username/password provided.  Create header to be used in next requests.
    write-host "IPAM Login"
    $ipamLogin = (Invoke-RestMethod -Uri "$($ipamBaseURL)user" -Method Post -SkipCertificateCheck -Headers @{'Authorization'='Basic '+[Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes($ipamUser+':'+$ipamPass))}).data.token
    $nextHeader = @{'phpipam-token'=$ipamLogin}

    # Get the subnet ID of the specified CIDR
    write-host "IPAM Get Subnet ID"
    $subnetID = (Invoke-RestMethod -URI "$($ipamBaseURL)subnets/cidr/$subnet" -SkipCertificateCheck -Headers $nextHeader).data.id

    # Make a reservation and provide name/description
    write-host "IPAM Reserve Next"
    $postBody = @{hostname="$($hostname).lab.enterpriseadmins.org"; description='Requested via Automation Extensibility'}
    $myIPrequest = (Invoke-RestMethod -URI "$($ipamBaseURL)addresses/first_free/$subnetID" -SkipCertificateCheck -Method Post -Headers $nextHeader -Body $postBody).data
    
    # Send a DNS Request to Jenkins
    write-host "Jenkins DNS Request"
    $dnsBody = @{reqtype='add'; reqhostname=$hostname; reqipaddress = $myIPrequest; reqzonename='lab.enterpriseadmins.org'} | ConvertTo-Json
    Invoke-RestMethod -URI 'http://jenkins.example.com:8080/generic-webhook-trigger/invoke?token=VRA-dnsRecord' -Method Post -Body $dnsBody -ContentType 'application/json'

    # Return detail to vRA
    $outputs = @{
        address = $myIPrequest
        resourceName = $hostname
    }
    return $outputs
}

The IP address obtained from IPAM as well as the hostname are returned when this task completes.
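For reference, the DNS request body produced by the ConvertTo-Json call above has the following shape (the hostname and IP address shown here are made-up example values):

```json
{
  "reqtype": "add",
  "reqhostname": "h007-ip-01",
  "reqipaddress": "192.168.10.57",
  "reqzonename": "lab.enterpriseadmins.org"
}
```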

Custom Action – Read

For our custom resource, we will also need to specify an action to read / check status of our resource. For my purposes, I really don’t need anything specific to be checked, so I simply return all the input parameters. This is the default function / template loaded when creating the action.

function handler($context, $inputs) {
    return $inputs
}

Custom Action – Delete

When we are finished with our deployment and ready to delete, the custom resource needs a ‘delete’ action to call. Again this is written in PowerShell and calls Jenkins to request the actual delete. Jenkins will then connect to DNS and IPAM to process the cleanup.

function handler($context, $inputs) {
    $ipAddress = $inputs.address
    $hostname = $inputs.name
    
    write-host "We've received a $($inputs.'__metadata'.operation) request for IP address $ipAddress and hostname $hostname"
     
    $removeBody = @{reqzonename='lab.enterpriseadmins.org'; operationType='remove'; reqhostname=$hostname; subnetOrIp = $ipAddress} | ConvertTo-Json
    Invoke-RestMethod -URI 'http://jenkins.example.com:8080/generic-webhook-trigger/invoke?token=RequestIpAndDnsRecord' -Method Post -Body $removeBody -ContentType 'application/json'
}

This code could easily have contacted IPAM and DNS as separate requests, but since the Jenkins job already existed with webhook support, I chose to follow that path for simplicity.

Create Custom Resource

In Aria Automation Assembler > Design > Custom Resources we can create a new resource which will run our above actions. I named my resource IPAM Next Address, set the resource type to Custom.IPAM.Request, and based the resource on an ABX user-defined schema. For lifecycle actions, I selected the actions described above for the three required types: create, read, and destroy. For starters I set the scope to only be available for my test project, and finally toggled the ‘activate’ switch to make the resource available in blueprints.

Create Template

In Aria Automation Assembler > Design > Custom Template, the design for this request is super simple. There are three inputs: issueNumber, Name, and Subnet. The issue number is used for tracking and becomes part of the host name. The name is the unique part of the hostname, and the subnet is which network to use when finding the next address. My hostname ends up being h<issue-number-padded-3-digits>-<name-entered> (h is a prefix I use for test systems in my homelab). The subnet is a drop-down list with the networks I typically use for testing, defaulting to the selection I use most often.
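The format("%03d", …) expression pads the issue number with leading zeros, printf-style. The same behavior can be illustrated from any shell (this is just an analogy for the Aria expression, not something the template itself runs):

```shell
# Issue number 7 plus name 'ip-01' yields the padded hostname
printf 'h%03d-%s\n' 7 ip-01   # prints: h007-ip-01
```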

formatVersion: 1
inputs:
  issueNumber:
    type: integer
    title: Issue Number
  Name:
    type: string
    minLength: 1
    maxLength: 25
    default: ip-01
  Subnet:
    type: string
    title: Subnet
    default: 192.168.10.0/24
    enum:
      - 192.168.10.0/24
      - 192.168.40.0/24
resources:
  IPAddress:
    type: Custom.IPAM.Request
    properties:
      name: h${format("%03d",input.issueNumber)}-${input.Name}
      subnet: ${input.Subnet}
      address: ''
      git-issue-number: ${input.issueNumber}

Deploy

Once I published a version of this design, I can now make a request from the service broker catalog. My request form only has a few required fields:

I added some functionality into the ‘create’ action to post a comment to my issue tracker letting me know that a new resource has been created. It is created with a task list check box, so that I can see there is an open item to review with this issue, as well as a link to the deployment.

When I look at the deployment, I can see when it was created, if it expires, and can use the actions drop down to delete the deployment. This delete action calls the Jenkins job mentioned above to remove the DNS record and release the IP address from IPAM.

Conclusion

Aria Automation can provide an interface to leverage existing workflows. This example shows how to create a deployment to track the lifecycle of a created resource, while leveraging an existing system to handle the actual task. This solves my cleanup / tracking issue for one off IP requests as well as getting all the requests submitted from a single console. Hopefully you can use pieces of this workflow in your own environment.

Posted in Lab Infrastructure, Virtualization

Automating SSL Certificate Replacement with the Aria Suite Lifecycle API

Someone recently asked me if there was an API to replace the Aria Operations for Logs SSL certificate programmatically. In this case, Aria Suite Lifecycle was already deployed and used to manage multiple Aria Operations for Logs clusters, primarily used in regional data centers to forward events to a centralized instance. This meant that our ideal solution would leverage Aria Suite Lifecycle as well, adding the certificate to the locker prior to replacing the certificate in product. A colleague of mine recently published a blog post showing how to rotate Aria Suite Local Account Passwords using APIs and PowerShell: https://stephanmctighe.com/2024/12/20/rotating-aria-suite-local-account-passwords-using-apis-powershell/, so I used the style/splatting method he used for consistency in this post.

Due to the varied nature of requesting/approving certificates, I did not cover the process of creating a certificate signing request using APIs for this example. However, it is possible to do this via API as well. The ‘Create CSR and Key Using POST’ can be called with a POST operation to /lcm/locker/api/v2/certificates/csr as described here: https://developer.broadcom.com/xapis/vmware-aria-suite-lifecycle-rest-api/8.14//lcm-15-186.eng.vmware.com/lcm/locker/api/v2/certificates/csr/post/.

Workflow

I first worked through each of these steps by creating a new collection in Bruno and stepping through each API to understand the inputs/outputs and how everything worked together. Once complete, I looked through each of the requests from Bruno and converted them to a single PowerShell script, to have the end-to-end workflow in a single document for reference. In the sections below, I’ll step through each chunk of the script and add some additional context on why each section exists and what it does.

Setting up the script

For readability and usability, I decided to have a block of variables and paths at the very start of the script. In this section, you can see Aria Suite Lifecycle hostname/credentials, and basic auth string being defined. There are then a handful of filename/paths related to the certificate, root certificate, and key needed for the certificate I created from a Windows Certificate Services deployment. We then list the name of the Aria Suite Lifecycle environment containing the product we need to update. For demonstration purposes, I created an environment named h308-logs, which only contained a single product (Aria Operations for Logs).

# LCM connection detail
$lcmHost = 'cm-lifecycle-02.lab.enterpriseadmins.org'
$username = 'admin@local'
$password = 'VMware1!'
$authorization = "Basic $([System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes("$($username):$($password)")))"

# Certificate/environment detail
$newCertificateAlias = 'h308-logs-01.lab.enterpriseadmins.org_2025-01-14'
$newCertificateFolder = 'C:\Users\bwuchner\Downloads'
$newCertificateCSR  = 'CSR_h308-logs-01.lab.enterpriseadmins.org_Test.pem'
$newCertificateFile = 'CERT_h308-logs-01.cer'
$newCertificateRoot = 'CERT_rootca.cer'
$environmentName = 'h308-logs'
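As a quick sanity check, the Basic authorization value built by the script can be reproduced from any shell with base64, using the same lab credentials shown above:

```shell
# Same user:pass string the PowerShell code base64-encodes
printf '%s' 'admin@local:VMware1!' | base64   # prints: YWRtaW5AbG9jYWw6Vk13YXJlMSE=
```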

Reading the certificate files

Our certificate consists of multiple files:

  1. The private key, which is at the end of the certificate signing request (CSR) file that was generated by the Aria Suite Lifecycle GUI.
  2. The certificate file, which was obtained from our certificate authority and contains subject alternative names for our Aria Operations for Logs hostname and IP address.
  3. The root certificate from our certificate authority. In this lab, there are no intermediate certificates required; if there were, they could be added to the $cert variable below.

When using Get-Content, by default PowerShell will read one line of the file at a time. In the examples below, we join each new line with a new line character (`n) so that the API will understand our request. Failure to do so might result in an error like parsing issue: malformed PEM data encountered, LCM_CERTIFICATE_API_ERROR0000, or Unknown Certificate error.

# When we generated a CSR in the UI, before sending it to our CA, the private key is at the end of the
# CSR file.  We'll read that file, loop through and find the start/end of the private key, then format
# it to send in our JSON body
$key = Get-Content "$newCertificateFolder\$newCertificateCSR"
$keyCounter = 0
$key | %{if($_ -eq '-----BEGIN PRIVATE KEY-----'){$keyStartLine=$keyCounter};  if($_ -eq '-----END PRIVATE KEY-----'){$keyEndLine=$keyCounter}; $keyCounter++ }
$key = ($key[$keyStartLine..$keyEndLine] -join "`n") 

# We'll also read in our cert and concatenate each line with a new line character.
# If we have intermediate certs they can be joined in a similar way
$cert = ((Get-Content "$newCertificateFolder\$newCertificateFile") -join "`n") + "`n"
$cert += ((Get-Content "$newCertificateFolder\$newCertificateRoot") -join "`n") + "`n"
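The private key extraction above can also be done from a shell with sed, which prints everything between the BEGIN/END markers inclusive. The demo below runs against a stand-in file with dummy contents, since the real CSR isn’t shown here:

```shell
# Build a stand-in CSR file containing a dummy private key section
cat > /tmp/demo-csr.pem <<'EOF'
-----BEGIN CERTIFICATE REQUEST-----
MIICdummyCSRdata
-----END CERTIFICATE REQUEST-----
-----BEGIN PRIVATE KEY-----
MIIEdummyKEYdata
-----END PRIVATE KEY-----
EOF

# Print only the private key block, newlines intact
sed -n '/-----BEGIN PRIVATE KEY-----/,/-----END PRIVATE KEY-----/p' /tmp/demo-csr.pem
```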

Adding the certificate to the locker

We can POST our new certificate/key combo to the /lcm/locker/api/v2/certificates/import API. It will return details on the certificate, such as the alias provided, the validity, and sha256/sha1 hashes. It does not return the ID of the certificate in the locker, which we’ll need in a future step. Therefore, now seemed like a good time to get the certificate by filtering for the Alias name we used in our original request.

$Splat = @{
    "URI"     = "https://$lcmHost/lcm/locker/api/v2/certificates/import"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Body"    = @{
        'alias'         = $newCertificateAlias
        'certificateChain' = $cert
        'privateKey'    = $key
    } | ConvertTo-JSON
    "Method"  = "POST"
}
$NewCertPost = Invoke-RestMethod @Splat
# the newcertpost variable will have detail on our certificate, its validity, and san fields.
# we will need cert ID, so we'll make a query for it.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/locker/api/v2/certificates"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
$lockerCertId = ((Invoke-RestMethod @Splat).Certificates | ?{$_.alias -eq $newCertificateAlias}).vmid

Depending on what parts of this process we want to automate, it would also be possible to just get the ID of the certificate from the locker in the GUI. When we view the specific certificate, the ID is the GUID we see in the address bar, right after /lcm/locker/certificate:

Finding the environment ID

To replace the product certificate, we’ll need to know which environment ID needs to be updated. We can find this information from the API or the GUI. We’ll start by doing a GET operation for all environments, then filtering by the environment name variable declared at the beginning of the script.

# now that we have our new cert in the locker, we can apply it to the product
# Get Environment ID
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments?status=COMPLETED"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
$Environments = Invoke-RestMethod @Splat

# find our specific environment ID
$environmentId = ($Environments |?{$_.environmentName -eq $environmentName}).environmentId

When we are looking at our specific environment in the GUI, the ID can be found in the address bar right after /lcm/lcops/environments:

Finding the product ID

The product ID is also needed for the certificate replacement request. After running the code block above that creates the $Environments variable, we can list product IDs using the code below. It again filters the list and selects all applicable products in our specific environment:

# We also need to know the product ID. We can get a list of product IDs for the
# above environment using the example below. In this case we only have Ops for Logs, aka 'vrli'.
($Environments | ?{$_.environmentName -eq $environmentName}).products.id
# returns: vrli

I didn’t find a clear way to see this product ID in the GUI. However, if you are looking at a specific product and select … > Export Configuration > Simple, the resulting file name should contain the product ID (example: h308-logs-vrli.json).

To make this more like a multiple-choice question, the values that I currently have across all products in my lab are listed below:

  • vidm
  • vra
  • vrli
  • vrni
  • vrops
  • vssc

Validating the certificate

In the section below, we POST to the pre-validate API to make sure our certificate will work. This API only returns the request ID of the task that is created. We can view progress of the request in the GUI, using a URL like https://cm-lifecycle-02.lab.enterpriseadmins.org/lcm/lcops/requests/acd529f9-e8af-4c61-9d6d-14ee15730c9d, where the GUID at the end of the URL is the value of $prevalidateRequest. In the code block, however, we wait 30 seconds and then GET the status of our request from the API. This needs to return COMPLETED before we move on to the next step. This sample code block does not include error checking/handling, as it is primarily an example of calling the APIs.

# Now that we know all the relevant IDs, we can verify our new cert will work.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments/$environmentId/products/vrli/certificates/$lockerCertId/pre-validate"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "POST"
}
$prevalidateRequest = (Invoke-RestMethod @Splat).requestId

# Let's confirm that our validation completed.
# We may need to wait/recheck here.
Start-Sleep -Seconds 30

# Ask the requests API if our task is complete.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/request/api/v2/requests/$prevalidateRequest"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
(Invoke-RestMethod @Splat).state  # We want this to return 'COMPLETED'. If it didn't, we should recheck and not continue.
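Rather than a fixed 30-second sleep, the wait/recheck could be wrapped in a small polling loop. The helper below is only a sketch: the function name Wait-LcmRequest is my own invention (not part of any module), it reuses the $lcmHost and $authorization variables defined earlier in the script, and it assumes a FAILED state indicates a terminal error.

```powershell
# Hypothetical helper: poll the LCM requests API until the request finishes or we time out.
function Wait-LcmRequest {
    param(
        [string]$RequestId,
        [int]$TimeoutSeconds = 600,
        [int]$PollSeconds    = 30
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        # Same requests endpoint used elsewhere in this post
        $state = (Invoke-RestMethod -Method GET `
            -Uri "https://$lcmHost/lcm/request/api/v2/requests/$RequestId" `
            -Headers @{ 'Accept' = '*/*'; 'Authorization' = $authorization }).state
        if ($state -eq 'COMPLETED') { return $true }
        if ($state -eq 'FAILED')    { throw "Request $RequestId failed" }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)
    throw "Timed out waiting for request $RequestId (last state: $state)"
}

# Example usage:
# Wait-LcmRequest -RequestId $prevalidateRequest
```

The same helper works for any request ID returned by the lcops APIs, so it could also be used later for the replacement request.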

Replacing the certificate

Assuming our pre-validate request above completed, we can move on to the certificate replacement. We do that with a PUT to the product's certificates endpoint, providing the ID of the certificate in the locker. The PUT only returns the request ID of our task.

# Assuming the above completed, let's keep moving and actually replace the cert.
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/lcops/api/v2/environments/$environmentId/products/vrli/certificates/$lockerCertId"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "PUT"
}
$replacementRequest = (Invoke-RestMethod @Splat).requestId

Checking request status

As mentioned in the certificate validation section above, we can query the request status from the API as well. This is the same code block used in the earlier section, changing only the request ID variable at the end of the URI.

# Once we start the replacement we should wait a bit of time and then see if it is complete
Start-Sleep -Seconds 30
$Splat = @{
    "URI"     = "https://$lcmHost/lcm/request/api/v2/requests/$replacementRequest"
    "Headers" = @{
        'Accept'        = "*/*"
        'Content-Type'  = "application/json"
        "Authorization" = $authorization
    }
    "Method"  = "GET"
}
(Invoke-RestMethod @Splat).state  # We want this to return 'COMPLETED'. If it returns 'INPROGRESS', wait and recheck until it does.

As mentioned before, we can view the status of our request in the GUI as well. The URL would be https://cm-lifecycle-02.lab.enterpriseadmins.org/lcm/lcops/requests/acd529f9-e8af-4c61-9d6d-14ee15730c9d, where the GUID at the end of the URL is the value of $replacementRequest. Alternatively, we could look in the Requests tab for a request named VRLI in Environment h308-logs - Replace Certificate.

Follow up tasks

After replacing a certificate, it is always a good idea to verify that the new certificate is trusted by various other products. For example, if you are using CFAPI to forward logs to this Aria Operations for Logs instance, you should check the source systems to make sure they trust the new certificate. In addition, Aria Operations and Aria Operations for Logs can be integrated; from the Aria Operations integration, confirm that Aria Operations for Logs is still trusted after completing this change. This is not specific to the API, just a reminder to ensure new certificates are trusted, whether they are replaced in the GUI or via the API.

Conclusion

In this post, we’ve explored how to automate the replacement of an SSL certificate in Aria Operations for Logs using the Aria Suite Lifecycle API. By leveraging PowerShell and the API’s various endpoints, we can streamline the process of managing certificates across Aria Suite environments, ensuring better security and consistency.

Remember, while the steps outlined here focus on certificate replacement, this workflow can also be adapted for other automation tasks within Aria Suite Lifecycle. As with any automation effort, it’s important to test thoroughly in a controlled environment and validate that all systems are properly configured and trust the updated certificates.

Whether you’re managing a single Aria Operations for Logs instance or multiple clusters, automating tasks like certificate replacement can significantly reduce manual effort and minimize downtime. Please continue to explore further API capabilities to enhance your operational efficiency and security posture!

Posted in Lab Infrastructure, Scripting, Virtualization

Unlocking the Power of Metric-Based Search in Aria Operations

When managing a large, virtualized environment, finding objects in Aria Operations can be challenging, especially when you don’t know the object name. Metric-based search, a feature introduced in Aria Operations 8.12, allows you to search for objects based on their metrics or properties—empowering you to quickly identify issues, even without specific names.

I recently posted about replacing some CPUs in my primary homelab system (https://enterpriseadmins.org/blog/virtualization/how-i-doubled-my-homelab-cpu-capacity-for-200-xeon-gold-6230-upgrade/). Prior to making this change, I knew I had a couple of VMs with rather high CPU Ready values, and I expected CPU Ready to decrease given the additional cores. I had an idea of which VMs were likely affected, but wanted to leverage metric-based search to make sure I wasn’t missing any.

What Is Metric-Based Search?

Metric-based search was introduced in Aria Operations 8.12 almost two years ago (https://blogs.vmware.com/management/2023/04/metric-based-search.html). It allows us to use metrics and properties in our search queries. Instead of typing a VM name, we can type a query for all VMs with high CPU Ready or Usage, like this:

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20

We start by typing ‘Metric’, telling the search box we want to search using a metric; we then specify the object type of virtual machine, and finally use a where clause to provide the additional metrics we wish to look at. The search bar helps auto-complete the entries and shows a green check once the syntax is correct.

In this case the query only returns one VM… my Aria Automation VM, which currently has >20% CPU usage. I’m not able to use the ‘transformation’ selection because the environment has 225 VMs, which exceeds the maximum scope of 200 called out in the tooltip below:

Using the ‘ChildOf’ Clause to Narrow Down Results

To refine my search results, I use the ‘childOf’ clause, which allows me to narrow down the query to a specific ESXi host. This is especially useful when I know the VMs I’m looking for are on the same host but don’t know their names.

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20 childOf core-esxi-34.example.com 

This unlocked the ‘transformation’ filter drop-down list, and I can now look at maximum values instead of current values. I could have used a different object in my childOf query, like a vSphere folder, distributed port group, datacenter, or custom datacenter: really any object that is a parent of a virtual machine in the inventory hierarchy. We can see that more VMs now match our criteria. Each of these VMs had CPU Ready above 2% prior to installing the new CPUs; after the upgrade, the values are much lower.

Understanding the Impact of CPU Speed on Performance Metrics

Interestingly, in the above images we can see that while CPU Ready has decreased substantially, CPU Usage has actually increased. I believe this is due to the clock speed of the CPU cores: previously the cores ran at 3.8 GHz, but they now run at 2.1 GHz. To do the same amount of work, the slower-clocked CPUs must run at a higher utilization percentage.
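As a rough sanity check (my own back-of-the-envelope math, assuming CPU usage scales inversely with clock speed for the same workload):

```powershell
# Same work on slower cores should need roughly (old clock / new clock) times the CPU %.
$oldGhz = 3.8
$newGhz = 2.1
$scale  = $oldGhz / $newGhz   # roughly 1.8x
"A VM at 20% usage before would land near {0:N0}% now" -f (20 * $scale)
```

Real workloads won't track this ratio exactly (per-core IPC, turbo behavior, and scheduling all differ), but it explains the direction of the change.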

Other Use Cases for Metric-Based Search

The side-by-side comparison of metrics in metric-based search is really helpful. It included the CPU Ready and CPU Usage values because those were the first two metrics in my query. If I adjust my query to include three metrics, such as:

Metric: Virtual Machine where CPU|Ready % > 2 or CPU|Usage % > 20 or Memory|Usage % > 5 childOf core-esxi-34.example.com 

I can select which metric is displayed in the left or right column using the column selector in the bottom left of the screen:

In the above examples, we are looking specifically at metrics of VMs. However, we can query properties the same way, as well as different object types. Here are a few examples:

VMs that have more than 5 VMDKs (property): Metric: Virtual Machine where Configuration|Number of VMDKs > 5

ESXi hosts that have less than 16 CPU cores (metric): Metric: Host System where Hardware|CPU Information|Number of CPU Cores < 16

Datastores with reclaimable orphaned disks (metric) and type (property): Metric: Datastore where Reclaimable|Orphaned Disks|Disk Space GB > 1 and Summary|Type equals 'NFS'

Conclusion: The Power of Metric-Based Search in Aria Operations

Metric-based search in Aria Operations is a powerful tool that helps you find the right objects even when you don’t know their names. By leveraging metrics like CPU usage or memory usage, you can quickly identify performance bottlenecks and optimize your virtualized infrastructure.

Posted in Lab Infrastructure, Virtualization