vRealize Operations Alerts using Rest Notification Plugin

I have created several vRealize Operations (vROps) alerts in the past, mainly using the Log File Plugin and Standard Email Plugin. However, I recently had someone ask for more information on using the Rest Notification Plugin. I hadn’t used this, so I started looking for more detail on how to get started.

I found a couple of really good blog posts on this, specifically https://blogs.vmware.com/management/2017/01/vrealize-webhooks-infinite-integrations.html and https://blogs.vmware.com/management/2018/02/extending-vrealize-operations-collaboration-tools-restful-apis-webhook-shims-david-davis-vrealize-operations-post-50.html. Both of these posts describe using an intermediary to accept what vROps sends and convert it into a format that another endpoint expects. There are a handful of integrations provided, so I started looking for one I could test with. The following post describes the steps to get this working.

The Test Service
For testing, I’m going to use vROps to send an alert to Slack. This was pretty straightforward: I created a new channel where I wanted the alerts to appear, then created a new incoming webhook by visiting https://api.slack.com/apps/new. I created a new app called vrealize-bot in my workspace. Once the app was created, I toggled on the ‘Incoming Webhooks’ feature and mapped it to my channel. This resulted in a webhook URL that looked like this:

https://hooks.slack.com/services/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

To confirm this was working, I used a quick PowerShell script to post to that webhook URL. This step isn’t required, but it proved that my webhook was correctly created.

Invoke-WebRequest `
   -Uri $webhookURL `
   -Method POST `
   -Headers @{"Content-type"="application/json"} `
   -Body (@{"text"="This is a test"}|ConvertTo-Json)
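
If you prefer testing from a Linux host, a roughly equivalent check with curl (assuming $webhookURL holds the same URL) would look like this:

curl -X POST -H 'Content-type: application/json' \
   --data '{"text":"This is a test"}' \
   "$webhookURL"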

The ‘Shim’
We need a piece of code to convert the vROps REST output into a format Slack will accept. The first blog post mentioned above calls out a prebuilt tool to do this, called the loginsightwebhookdemo. There are instructions available on getting this running, but the easiest route for me was to use the docker image. I started by downloading the Photon OS 3.0 OVA, deployed it to an ESXi host, and then enabled docker using these instructions. I ran three commands; only the middle one is required, the other two just show some supplemental info.

systemctl status docker
systemctl start docker
docker info

Once docker is running, you can start the webhook-shims container using these instructions. As described in the instructions, you launch the bash shell, which gives you the ability to edit files in the container file system to add things like our Slack API URL. If you choose this route, once the files are edited in the loginsightwebhookdemo directory, you’ll need to run ./runserver.py from the webhook-shims directory. However, since we are only using the Slack shim in this example, there is an easier way. All we need to do is pull & run the container using these commands:

docker pull vmware/webhook-shims
docker run -it -p 5001:5001 vmware/webhook-shims
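
The -it flags keep the container attached to your terminal, which is handy for watching requests arrive. If you would rather run the shim in the background so it survives a logout, a detached run might look like this:

docker run -d -p 5001:5001 --restart unless-stopped vmware/webhook-shims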

With the container running, we can access its info page at http://dockerhostnameorip:5001/. This page confirms everything is up and running and that you can connect to the service.

The vROps Alerts
From the Alerts > Alert Settings > Notification Settings area, we can add a new rule. We will select the Rest Notification Plugin method and add a new instance, which we’ll name SlackWebhook_vrealize-bot. We could enter anything we want here, but we want to be as descriptive as possible. This name captures the service we are using, how we are connecting, and the application that will do the posting, which seems sufficient. The URL is where the magic happens. We’ll enter a URL like this:

http://dockerhostnameorip:5001/endpoint/slack/TTTTTTTTT/BBBBBBBBB/alphaNumericStr0ngOfText

This is the name of the host our container runs on and the port the service is exposed on; endpoint/slack specifies which shim we want to use, and TTTTTTTTT/BBBBBBBBB/alphaNumericStr0ngOfText comes from the Slack webhook we created at the beginning of the article. We will leave the username and password blank (all of that authentication is handled by our custom webhook URL). For content type we’ll select application/json. Pressing TEST should result in two new posts to our Slack channel.

Slack Channel posting from vRealize Operations Rest Notification Plugin / webhook shim

All that’s left is to define the filtering criteria for which alerts we want sent to Slack. For testing, I just set criticality to Immediate or Critical, but we will likely want to narrow that down over time, as it is a bit too chatty.

Using Log Insight Agent to find insecure LDAP binds

I recently read an interesting article on the vSphere Blog: https://blogs.vmware.com/vsphere/2020/01/microsoft-ldap-vsphere-channel-binding-signing-adv190023.html, which states Microsoft is making a change to the default behavior of LDAP servers that are part of Active Directory. This change will require secure LDAP channel binding by default, and is scheduled to be implemented in March 2020. The article goes on to say that vCenter using Integrated Windows Authentication (IWA) is not affected, so my lab should be fine…right? I have systems other than vCenter connecting over LDAP, so I need to double-check. I found a really good article from 2016 (https://docs.microsoft.com/en-us/archive/blogs/russellt/identifying-clear-text-ldap-binds-to-your-dcs) that shows how to increase logging levels and includes an Event Viewer view and PowerShell script to find these events. However, I wanted to make this same data visible as a dashboard in Log Insight. The following post recaps the required steps to build a similar dashboard for your environment.

The first step was to enable the additional logging. I did this by adding the following registry key to all domain controllers in the environment. I did this in a lab with very few domain controllers, so I just ran this command on each, one at a time:

# Enable Simple LDAP Bind Logging
reg add HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics /v "16 LDAP Interface Events" /t REG_DWORD /d 2 

My domain controllers were already running the Log Insight agent and the Active Directory content pack was already installed and configured. You can check this in your environment by going to Settings > Content Packs. If you do not see Microsoft-Active Directory listed under the Installed Content Packs header, you can get it from the Marketplace. When you install the content pack from Marketplace, it will display a set of instructions you need to follow to enable appropriate collection. If you have already installed the content pack, you can find these setup instructions by browsing to the content pack, clicking the ‘gear’ icon and selecting ‘setup instructions’ (screenshot below):

For example, you may need to enable additional audit logs in your Default Domain Controllers policy – the details are included in the popup. You will also need to create a copy of the agent configuration from Settings > Administration > Agents (among other settings, this will ensure the Directory Services log is captured — and that is where event 2889 is logged). Here is a sample of the agent configuration, showing that our newly created configuration applies to both of my lab domain controllers.

Now that Windows is logging event ID 2889, and the Log Insight agent is picking that event up, we can focus on extracted fields and dashboards in Log Insight.

Extracted Fields
An extracted field is a regular expression definition of text we want to find in our events. If we look at a sample of one such event (screenshot below), we can see there are three fields that we likely want to capture:
– Client IP Address
– Identity the client attempted to authenticate as
– Binding Type

The easiest way to create an extracted field is to highlight the text we want to extract. A popup will appear, giving us filtering choices as well as an option to extract a field. If we select that option, a new field definition will appear on the right of the page.

We can see that the default criteria looks for an IPv4 address that appears after the text address: and before the text :38130. This is close to what we want, but the pre-text could be more specific and the post-text is too specific (the source port can be any random high-numbered port). We can edit the default criteria:
– Pre text: Client IP address:\s* [adding the Client IP prefix as shown in the example event]
– Post text: : [removing the ending port number, leaving only the single colon]
We will also give this field a name (Ldap2889_ClientIP) and make it visible to all users. The resulting field definition should look like this:

Now that we know how to extract the data we want, we will repeat the process to create other extracted fields for Ldap2889_Identity and Ldap2889_BindingType.
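
If you’d like to sanity-check one of these patterns outside of Log Insight, you can approximate the pre/post text with grep against a sample of the event text. This is just a sketch; the address and port below are made up:

# print only the IPv4 address between 'Client IP address:' and the next colon
echo 'Client IP address: 192.168.45.25:38130' \
  | grep -oP 'Client IP address:\s*\K[\d.]+(?=:)'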

Dashboard
Once we have extracted interesting fields, we can build visuals to consume that information. The easiest way to do this is from Interactive Analytics. We start by applying filters — in this example we add a filter for eventid = 2889. We can then add interesting groupings of data. In the top section of Interactive Analytics we see a timeline of findings. In the bottom left area we see a couple of drop-downs. The first one defaults to ‘Count of events,’ which is typically the most logical grouping, though other choices are available and worth reviewing. The second drop-down defaults to ‘over time,’ and is the one that typically gets adjusted. For example, we can select ‘Non-time series’ and then Group By one of our custom extracted fields — for example, Ldap2889_ClientIP.

Screenshot of count of events grouped by Ldap2889_ClientIP

Using the options to the right, we could easily turn this visual from a column to a pie graph. Once we have it the way we like, we can use the ‘Add to Dashboard’ button in the top right to save this view for later. The popup will ask us for a visual name, and then to select the dashboard where we’d like to see this data. We can make a new dashboard from this same screen if needed, and even share with all other users. We can create other visuals, for example, groupings by Ldap2889_Identity or BindingType, or even include some time series metrics to see when most of the logins are occurring. Looking at my dashboard, I can see that I may have a bit more work ahead of that March 2020 deadline.

Ubuntu 18.04 Grow Filesystem

In a previous post we created an Ubuntu 18.04 template that only has a 16GB disk. The template is fairly lean, without many packages installed, but the root filesystem is already at 21% full. For many applications this will be enough space, but some will require more. In this post, we will grow that file system by extending the disk from 16GB to 50GB.

  • Check existing free space with the command df -h.
  • Use the vSphere client to increase the size of Hard Disk 1.
  • Rescan the devices to find the new space with the command: echo 1 > /sys/class/block/sda/device/rescan
  • Run lsblk to list available blocks.
  • Grow the partition by running growpart /dev/sda 1 (that’s the disk, a space, and the partition number)
  • Grow the filesystem with the command resize2fs /dev/sda1
  • Confirm the disk has grown with df -h
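
Put together, the in-guest portion of the process is a short script. Below is a sketch that assumes the disk is /dev/sda with the root filesystem on partition 1, and that it is run as root after the disk has been resized in the vSphere client (growpart is provided by the cloud-guest-utils package):

df -h /                                       # note the current size/usage
echo 1 > /sys/class/block/sda/device/rescan   # rescan the device for new capacity
lsblk                                         # confirm sda now shows the larger size
growpart /dev/sda 1                           # grow partition 1 to fill the disk
resize2fs /dev/sda1                           # grow the ext4 filesystem
df -h /                                       # confirm the new free space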

We now have 93% free on the root filesystem instead of the previous 79%.

Deploy Ubuntu 18.04 from Template

In a previous post we created an Ubuntu 18.04 template. When that template is deployed, we need to make a few changes to customize the VM’s personality. To assist with that, we’ll create a customization specification from Menu > Policies and Profiles > VM Customization Specifications.

  • Select New VM Customization Specification for Linux.
  • Use the virtual machine name for the computer name, and enter a domain name for the lab — I’ll use the same as my AD domain name
  • Select a time zone
  • We can use standard network settings (DHCP) or configure common network properties (like subnet mask and gateway) and prompt for IP address. If needed, we can make multiple customization specs depending on our needs.
  • Enter DNS server & search path settings

Deploy from the previously created template. Ensure that the VM name is unique, as it will become the computer name and be joined to AD. For this example, I’ll use: ubuntu1804e

  • Select compute/storage appropriate for the infrastructure
  • Check the boxes to customize the operating system, customize the VM hardware (if needed, for example to change to a different network port group), and power on the VM after creation.

After a few minutes, the VM will be deployed, powered on, and we can see the name we entered and a DHCP assigned address exposed by VMware Tools:

Login as the template-admin user, then switch to root using su -. Enter the root password when prompted. In the template creation process, we added a script named joinad.sh to the root user’s profile, so we can easily add this machine to the domain. To execute that script, simply run ./joinad.sh.
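
Before rebooting, we can confirm the join actually succeeded using the same PBIS CLI the script calls; it should report the domain name we joined:

/opt/pbis/bin/domainjoin-cli query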

We are prompted to reboot, but before we do that, we should also disable the template-admin account. The quickest way to make that happen is with:

usermod -L template-admin
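
If we ever need that account again (for example, to troubleshoot a broken domain join from the console), the same tool can unlock it:

usermod -U template-admin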

Once the account is locked, reboot. We now have an Ubuntu 18.04 image, supporting Active Directory authentication, ready to use.

Lab Updates: Ubuntu Server 18.04 LTS Template

Over three years ago I created a post for creating an Ubuntu 16.04 template for use in a lab environment. I’ve been using that template, with very minor updates, ever since. Ubuntu 16.04 LTS (Long Term Support) will stop receiving maintenance updates in just over a year, so I plan to start moving to 18.04 LTS. More information on Ubuntu release cycles can be found here: https://ubuntu.com/about/release-cycle.

To begin, we will download the ubuntu-18.04.3-server-amd64.iso file from http://cdimage.ubuntu.com/releases/18.04/release/ and upload it to a datastore. Next we will create a new virtual machine, entering names and selecting host/datastore/virtual hardware compatibility levels appropriate for our infrastructure.  For the guest OS, select Linux > Ubuntu Linux (64-bit).  The default network adapter should be vmxnet3, but I changed the SCSI controller from the default LSI Logic Parallel to PVSCSI.  For the template, I stuck with 1 vCPU, 1GB RAM, and a 16GB disk (in a future post we will cover growing the filesystem if additional space is needed).  We will select ‘Datastore ISO File’ and browse to the ubuntu-18.04.3-server-amd64.iso file we uploaded earlier, and then confirm that ‘Connect At Power On’ is selected for the CDROM.

With the VM created, we will power it on and open the console; the installer should begin automatically.  For the install, I mostly selected defaults, except for the following:

  • User account to be created: template-admin
  • Partition disks: Guided – use entire disk (without LVM).  When using guided with LVM, the volumes embed the template hostname, which will change on deployment.  We can still grow the filesystem without LVM, and other LVM volumes can be set up in the future, so we will continue without LVM for now.
  • Install security updates automatically.  (Setup Landscape in the future?)
  • Select the box to add openssh server

When complete the installer will disconnect (eject) the CDROM.  This is a good time to edit the VM settings to switch back to ‘Client Device’ so that no CD is attached to the resulting templates once deployed.

Login as the local user created during install (template-admin in this example).  The IP address for ens192 (the default network adapter) should appear.  We can now SSH in so the following commands can be copied/pasted. First we will set a password for root:

template-admin@ubuntu:~$ sudo passwd root
[sudo] password for template-admin:
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully 

Switch to the root user; this will save a bit of time, as we can run commands without specifying sudo each time. Once we are running as root, we will make sure the system is up to date.

su -
apt update && apt -y upgrade 
apt clean && apt -y autoremove --purge 

Instead of using local accounts, we will join our Ubuntu systems to Active Directory using the BeyondTrust AD Bridge Open Edition. The process is described here: https://repo.pbis.beyondtrust.com/apt.html.  Specifically we will run:

wget -O - http://repo.pbis.beyondtrust.com/apt/RPM-GPG-KEY-pbis|sudo apt-key add -
sudo wget -O /etc/apt/sources.list.d/pbiso.list http://repo.pbis.beyondtrust.com/apt/pbiso.list
sudo apt-get update
sudo apt-get install pbis-open 

Joining the domain takes a small handful of commands, so we will create a shell script in the template to help with future domain joins.  We will put this script in the root user’s home directory, so after customization we can launch it simply with ./joinad.sh.  Launch a text editor such as nano joinad.sh and paste in the following text:

#!/bin/bash
# The following line has the OU, domain name, user account, and password of a user with permissions to create computer objects.
/opt/pbis/bin/domainjoin-cli join --ou "LAB Servers/Services" lab.enterpriseadmins.org svc-windowsjoin VMware1!
/opt/pbis/bin/config AssumeDefaultDomain true
/opt/pbis/bin/config LoginShellTemplate /bin/bash
/opt/pbis/bin/config HomeDirTemplate %H/%U
/opt/pbis/bin/config RequireMembershipOf "lab\\domain^users"
/opt/pbis/bin/update-dns

Save the file, then make it executable

chmod +x joinad.sh

When executing the above script, a DNS record is created. In the past I’ve had issues with that record eventually being scavenged/deleted by the DNS server. To ensure that the DNS record is occasionally updated, I like to add a task to crontab. We can do this in the template by running crontab -e which will allow us to select a default editor. Once inside the editor we can add a single line like the following:

 1 1 * * 0,3 /opt/pbis/bin/update-dns 

This will schedule the task to run at 1:01 every Sunday and Wednesday. For reference, I stumbled on a great site to validate crontab syntax at https://crontab.guru/.
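
Since this is a template, you may prefer to add the entry without opening an interactive editor. A common shell idiom is to append to the existing crontab, as sketched below:

(crontab -l 2>/dev/null; echo '1 1 * * 0,3 /opt/pbis/bin/update-dns') | crontab -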

The last AD related configuration to make is to add a domain group to the /etc/sudoers file, so that certain users can use sudo to run commands as root. To do this, we edit the /etc/sudoers file and add a line similar to:

 %lab^linux^sudoers   ALL=(ALL) ALL,!ROOTONLY 

This will allow members of the AD group LAB Linux Sudoers to execute commands such as sudo whoami. After entering their password, they should see that they are running the command as root.
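
A note of caution: a syntax error in /etc/sudoers can break sudo for everyone, so it is safest to edit the file with visudo, which validates changes before saving. You can also check the current file at any time:

visudo -c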

By default, netplan uses a client ID for DHCP assignments.  When using DHCP, we want to use the MAC address as the identifier instead.  I found two sources describing this: first https://bugs.launchpad.net/netplan/+bug/1759532, which links to this parent bug: https://bugs.launchpad.net/netplan/+bug/1738998. The recommendation is to change the netplan YAML config file to include dhcp-identifier: mac. We will do this in a text editor by running nano /etc/netplan/01-netcfg.yaml. The resulting file should look similar to this:

network:
  version: 2
  renderer: networkd
  ethernets:
    ens192:
      dhcp4: yes
      dhcp-identifier: mac

We could apply this with netplan apply, but doing so will likely result in a new IP address assignment from DHCP and a disconnect from SSH. I really only need the setting for the future, so I’ll leave this for the next reboot.
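
If you want some assurance that the YAML is well formed before that reboot, netplan can render its backend configuration without applying it; parse errors will be reported:

netplan generate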

While working on this template, the VM console was at the login screen, and I accidentally hit the CTRL+ALT+DELETE button in the vSphere HTML5 client… and the VM immediately rebooted.  I tried this a couple of times, and a bit of research confirmed it is the default behavior.  I want to disable that in my template, so I used the instructions here: https://www.linuxbuzz.com/disable-reboot-ctrl-alt-del-ubuntu-debian/.  Since we are already switched to the root user, we don’t need to specify sudo for each line and can run these two commands:

systemctl mask ctrl-alt-del.target
systemctl daemon-reload

For time keeping we’ll use timesyncd, so we’ll edit the config file and add our NTP servers. We’ll remove the comment from the NTP line and add our servers, separated by spaces. We can edit the file with any text editor, such as nano /etc/systemd/timesyncd.conf. After changing the file, we’ll want to make these servers active, which we can do with the following restart command:

systemctl restart systemd-timesyncd.service
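
For reference, the edited [Time] section might look like the sketch below (the server names are placeholders, substitute your own). After the restart, timedatectl status should show the system clock as synchronized.

# /etc/systemd/timesyncd.conf
[Time]
NTP=ntp1.example.com ntp2.example.com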

Install the Log Insight agent.  I have previously downloaded the Log Insight agent installers and placed them on an internal web server.  The web server does not support .deb files, so I simply added a .zip to the end of the file name.  After downloading, we will need to rename the file back to the original name:

cd /tmp
wget http://www.example.com/vmware-log-insight-agent_8.0.0-14743436_all_192.168.45.80.deb.zip
mv vmware-log-insight-agent_8.0.0-14743436_all_192.168.45.80.deb.zip vmware-log-insight-agent_8.0.0-14743436_all_192.168.45.80.deb
dpkg -i vmware-log-insight-agent_8.0.0-14743436_all_192.168.45.80.deb

Check the configuration file to ensure it has the settings you want, for example with nano /var/lib/loginsight-agent/liagent.ini. In my case, I decided to enable the central_config and auto_update properties.
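
For reference, a minimal sketch of those settings is below. The hostname value is a placeholder for your Log Insight server, and key placement may vary slightly between agent versions, so treat this as illustrative rather than authoritative:

[server]
hostname=loginsight.example.com
central_config=yes

[update]
auto_update=yes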

I used part of a script from https://jimangel.io/post/create-a-vm-template-ubuntu-18.04/ to make sure new openssh-server keys are generated after template deployment. Check out the original post for additional optional settings you may want to change at first boot, like a randomly generated hostname. The text below can be run from the shell:

#add check for ssh keys on reboot...regenerate if necessary
cat << 'EOL' | sudo tee /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
test -f /etc/ssh/ssh_host_dsa_key || dpkg-reconfigure openssh-server
exit 0
EOL

# make sure the script is executable
chmod +x /etc/rc.local

There are a handful of cleanup items we will want to run anytime we crack open the template for updates. Those commands are listed below:

rm -rf /tmp/*
rm -rf /var/tmp/*
rm -f /etc/ssh/ssh_host_*
history -c
shutdown -h now

Add a description to the VM note/annotation field.  This will be cloned when the template is deployed, so it will give you an idea of the starting point for all subsequent VMs.  For example, I added the following text:

2020-01-19: Ubuntu 18.04 Template, Open-VM-Tools 11.0.1,
pbis-open, Log Insight Agent

Finally, convert template-Ubuntu1804 to a template. We now have an Ubuntu 18.04 template that is ready for use.