Troubleshooting vCenter permission errors

I was recently helping troubleshoot an issue where a service account was configured with the least privileges possible. When the service attempted to perform a specific operation, an access denied message was encountered. The service performing this action immediately cleaned up after itself, deleting the virtual machine that was created.

Typically in the UI we can see a warning event on an object when a required privilege is missing. For example, in the following screenshot a read-only service account attempted to change the CPU count for a virtual machine. This operation failed due to a missing permission, but we can clearly see the missing privilege is VirtualMachine.Config.CPUCount.

However, in our specific case the affected object was destroyed automatically, and we didn’t have an opportunity to view this event in the UI on our specific VM. We could have likely found this event on a parent object, but the environment had a lot of events occurring, making it difficult to find in the UI. Instead, we used PowerCLI to filter the logs for what we needed. In this sample we are using Get-VIEvent to query for all events in the last 15 minutes, then filter on the client side where the event text contains my service account and the Event Type is the specific NoPermission event we were interested in.

Get-VIEvent -Types Warning -Start (Get-Date).AddMinutes(-15) -Finish (Get-Date) | 
Where-Object {$_.FullFormattedMessage -match 'svc-vspherero' -AND 
       $_.EventTypeId -eq 'com.vmware.vc.authorization.NoPermission'} |
Select-Object FullFormattedMessage, ObjectName

This worked for our case. Another option would be to use the Get-VIEventPlus function from here: https://www.lucd.info/2013/03/31/get-the-vmotionsvmotion-history/. Using this custom function, an EventFilterSpec is created so that vCenter only returns the NoPermission events. This is more efficient than filtering on the client side and lets us return more applicable events. Using the sample below, we can group our NoPermission events and see how many times they occurred.

$allNoPerm = Get-VIEventPlus -EventType com.vmware.vc.authorization.NoPermission
$allNoPerm | Group-Object -Property FullFormattedMessage | 
  Select-Object Name, Count | Sort-Object -Property Count -Descending

For example, using the above code in my lab I found some additional service accounts that were missing required permissions. I was able to review documentation and confirm that the required privileges were updated between when I initially created the custom role and the current version of the service.

I hope these sample queries help identify missing privileges if needed.

Posted in Scripting, Virtualization

Linux P2V with vCenter Converter Standalone

I was recently speaking with someone who was encountering an Access Denied issue when trying to P2V an older CentOS physical machine. The specific error they were getting was “The thumbprint of the remote host you are connecting to is: Access denied.” Screenshot below for reference:

I was able to recreate this error in a lab. In this post we’ll explore the cause of the problem and a couple of possible solutions.

Looking at the documentation (https://docs.vmware.com/en/vCenter-Converter-Standalone/6.6/vcenter-converter/GUID-E6C55568-EE61-4D1F-A3DC-71269790D9FD.html), we see a few prerequisites to prepare the source Linux machine. The first two are:

  • Enable SSH on the source Linux machine.
  • Make sure that you use the root account or account with sudo privileges without password prompt for all commands to convert a powered on Linux machine.

The initial error of “The thumbprint of the remote host you are connecting to is: Access denied.” occurred when using the root account for a host where PermitRootLogin was set to no.
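One way to confirm this cause before changing anything is to ask sshd for its effective settings. The sketch below is an assumption on my part rather than a step from the Converter documentation: it relies on an OpenSSH release that supports extended test mode (sshd -T with a -C connection spec), it needs to run as root on the source machine, and the function name is made up for illustration.

```shell
# Hypothetical helper: print the effective PermitRootLogin value that sshd
# would apply to a root login coming from the given client address.
# $1 = client IP address (for example, the Converter client or the helper VM)
effective_root_login() {
    # -T prints the effective configuration; -C supplies the connection
    # details that sshd uses to evaluate any Match blocks.
    sshd -T -C "user=root,host=$1,addr=$1" |
        awk 'tolower($1) == "permitrootlogin" { print $2 }'
}

# Example (as root): effective_root_login 192.168.127.194
```

A value of no for the Converter client address would line up with the Access denied error seen when the root account is used.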

There are two possible solutions for this issue, described in that second bullet. We can either get the root account functional (over SSH), or switch to a user with sudo access.

Enabling SSH for the root user

Many sshd_config files disable root login over SSH by default. You typically need to log in as a regular user and then switch to root (su -) or prefix commands with sudo. To keep things as secure as possible while temporarily satisfying this requirement, we can add a couple of lines to the very end of our sshd_config file:

Match Address 192.168.127.194,192.168.127.81
     PermitRootLogin yes

Once we’ve saved our config file changes, we’ll reload the SSHD config to make this change active. For this older CentOS test system, the command is service sshd reload.

This allows us to enable root login over SSH, but only for two client systems. The first IP I have listed is the Windows system where I’m running the VMware vCenter Converter Standalone client. This client logs in to view the source machine details while we are configuring the job. The second IP address listed is the temporary IP address assigned to the helper VM. I’m using a static IP address for the helper VM, which allows me to put that specific IP address in my sshd config. If I were using a DHCP address, I could list the subnet instead, for example: 192.168.127.0/24. For reference, failure to list this helper VM address results in the error Permission denied, please try again.

Once the P2V migration is complete, we can edit the /etc/ssh/sshd_config file again, this time removing the two lines added. We’ll reload the config, again with service sshd reload.
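The add-then-remove steps above can be sketched as a pair of shell functions. This is a minimal sketch, not something from the Converter documentation: the function names and the BEGIN/END marker comments are assumptions, added so the temporary block can be removed cleanly later, and the file path is passed as an argument so the functions can be tried on a copy of sshd_config first.

```shell
# Hypothetical helpers; the marker comments are a convention invented for
# this sketch, not an sshd feature.

# Append a temporary Match block allowing root login from the given addresses.
# $1 = path to sshd_config (or a copy), $2 = comma-separated client addresses
add_p2v_match() {
    # Make sure the file ends with a newline so the marker starts its own line.
    if [ -n "$(tail -c 1 "$1")" ]; then echo >> "$1"; fi
    cat >> "$1" <<EOF
# BEGIN p2v-temp
Match Address $2
     PermitRootLogin yes
# END p2v-temp
EOF
}

# Remove everything between the two marker comments, markers included.
# The result is written back into the existing file so its ownership and
# permissions are left alone.
# $1 = path to sshd_config (or a copy)
remove_p2v_match() {
    sed '/^# BEGIN p2v-temp$/,/^# END p2v-temp$/d' "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}

# Example (as root):
#   add_p2v_match /etc/ssh/sshd_config 192.168.127.194,192.168.127.81
#   sshd -t && service sshd reload      # syntax check, then reload
#   ...run the P2V job...
#   remove_p2v_match /etc/ssh/sshd_config
#   sshd -t && service sshd reload
```

Because a Match block runs until the next Match line or the end of the file, the block is appended at the very end, the same placement used above.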

sudo without password prompt

If the preference is to designate a specific account to use instead of root, that can be done instead. The account will need to be able to execute all commands with sudo privileges without a password prompt. In this example, we will create a specific user just to do the P2V:

sudo useradd myp2vuser
sudo passwd myp2vuser # set a password for this user

Once our specific user is created, we’ll edit the /etc/sudoers file to add the following line:

myp2vuser  ALL=(ALL)  NOPASSWD: ALL

Once we’ve saved the changes to our sudoers file, we’ll attempt to log in over SSH as the new user. We should be able to log in without issue. To test that sudo is working, I like to run a command like cat /etc/shadow. It should fail with a permission denied error. We can then run sudo cat /etc/shadow, which should work without being prompted for a password.

Once the migration is complete, we can remove our temporary user with the command userdel myp2vuser and remove the entry added to the /etc/sudoers file.
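As an alternative to editing /etc/sudoers directly, the same rule can live in its own drop-in file, which turns the cleanup into a single file removal. The sketch below is an illustration under stated assumptions rather than a step from the Converter documentation: it assumes the installed sudo reads /etc/sudoers.d (very old releases may not), and the function name is hypothetical.

```shell
# Hypothetical helper: print a passwordless sudo rule for one user.
# Names containing anything other than letters, digits, '_' or '-' are
# rejected so stray characters cannot end up in a sudoers file.
# $1 = user name
nopasswd_rule() {
    case "$1" in
        ''|*[!A-Za-z0-9_-]*)
            echo "invalid user name: $1" >&2
            return 1
            ;;
    esac
    printf '%s  ALL=(ALL)  NOPASSWD: ALL\n' "$1"
}

# Example (as root, assuming /etc/sudoers.d is included by /etc/sudoers):
#   nopasswd_rule myp2vuser > /etc/sudoers.d/myp2vuser
#   chmod 0440 /etc/sudoers.d/myp2vuser
#   visudo -cf /etc/sudoers.d/myp2vuser    # syntax check only
# Cleanup after the migration:
#   rm -f /etc/sudoers.d/myp2vuser
#   userdel myp2vuser
```

Running visudo in check mode before logging out is a cheap safeguard, since a syntax error in any sudoers file can lock out sudo entirely.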

Conclusion

With either of the above options from the product documentation, we are able to complete a P2V migration without the ‘Access Denied’ error previously reported. Hopefully this helps if you run into a similar issue!

Posted in Virtualization

PowerCLI 13.3, Scheduled Snapshot Removal, and Privilege Report

PowerCLI 13.3 was recently released (https://blogs.vmware.com/PowerCLI/2024/07/introducing-powercli-13-3.html). This release has a handful of really good features that we’ll explore below.

vSphere modules updated to vSphere 8.0U3

The vSphere modules in PowerCLI 13.3 have been updated to support vSphere 8.0U3 features. One of the 8.0U3 features I was interested in automating was the scheduled deletion of a virtual machine snapshot. In the vCenter Server UI, Developer Tools > Code capture can show us the code needed to schedule this task, for example:

$entity = New-Object VMware.Vim.ManagedObjectReference
$entity.Type = 'VirtualMachine'
$entity.Value = 'vm-39'
$spec = New-Object VMware.Vim.ScheduledTaskSpec
$spec.Scheduler = New-Object VMware.Vim.HourlyTaskScheduler
$spec.Scheduler.ActiveTime = [System.DateTime]::Parse('08/07/2024 17:02:00')
$spec.Scheduler.Interval = 1
$spec.Scheduler.Minute = 2
$spec.Notification = 'bwuchner@example.com'
$spec.Name = 'test-tc-09 - Hourly delete snapshot schedule'
$spec.Action = New-Object VMware.Vim.MethodAction
$spec.Action.Argument = New-Object VMware.Vim.MethodActionArgument[] (2)
$spec.Action.Argument[0] = New-Object VMware.Vim.MethodActionArgument
$spec.Action.Argument[1] = New-Object VMware.Vim.MethodActionArgument
$spec.Action.Argument[1].Value = New-Object VMware.Vim.SnapshotSelectionSpec
$spec.Action.Argument[1].Value.RetentionDays = 3
$spec.Action.Name = 'RemoveAllSnapshots_Task'
$spec.Description = ''
$spec.Enabled = $true
$_this = Get-View -Id 'ScheduledTaskManager-ScheduledTaskManager'
$_this.CreateScheduledTask($entity, $spec)

However, running the above with prior versions of PowerCLI (i.e., 13.2 or earlier) resulted in the following errors:

New-Object : Cannot find type [VMware.Vim.SnapshotSelectionSpec]: verify that the assembly containing this type is
loaded.
At line:1 char:34
+ ... ction.Argument[1].Value = New-Object VMware.Vim.SnapshotSelectionSpec
+                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidType: (:) [New-Object], PSArgumentException
    + FullyQualifiedErrorId : TypeNotFound,Microsoft.PowerShell.Commands.NewObjectCommand


The property 'RetentionDays' cannot be found on this object. Verify that the property exists and can be set.
At line:1 char:1
+ $spec.Action.Argument[1].Value.RetentionDays = 3
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : PropertyNotFound

With PowerCLI 13.3, this code works as expected and returns the managed object reference ID of the created schedule:

Type          Value
----          -----
ScheduledTask schedule-203

Looking in the vCenter Server UI we can also confirm that the scheduled task exists.

Get-VIPrivilegeReport

William Lam had previously posted about using the List Privilege Check API introduced in vSphere 8.0U1. This API would show the minimum required permissions to do specific actions, which is especially helpful if we want to use a service account with least privilege to run an automated task. PowerCLI 13.3 introduces the ability to easily call this API for a specific code block.

As an example, I used the following:

$pr = Get-VIPrivilegeReport {
# above code for scheduled deletion of a virtual machine snapshot
}

This created a $pr variable with a privilege report for all the privileges that would be required to schedule such a task. If we look at the $pr variable we’ll see:

EntityId                 Principal    Privilege                           Server
--------                 ---------    ---------                           ------
vim.Folder-group-d1      bwuchner@lab System.Read                         vc3
vim.Folder-group-d1      bwuchner@lab System.View                         vc3
vim.VirtualMachine-vm-39 bwuchner@lab ScheduledTask.Create                vc3
vim.VirtualMachine-vm-39 bwuchner@lab System.Read                         vc3
vim.VirtualMachine-vm-39 bwuchner@lab VirtualMachine.State.RemoveSnapshot vc3

Using this list, we can see that in addition to the default Read/View access, our service account will need ScheduledTask.Create and VirtualMachine.State.RemoveSnapshot privileges in our custom role.

Combining code capture, PowerCLI 13.3, and Get-VIPrivilegeReport can provide an experience that is much easier than the trial and error approach from the past.

Posted in Scripting, Virtualization

Monitoring MongoDB Enterprise using LDAP Authentication

I had a recent need to dig into MongoDB monitoring with Aria Operations. In the previous posts on that topic, I used a preconfigured Bitnami MongoDB virtual appliance, which runs MongoDB Community Edition. As a follow-up, I looked into whether it was possible to use an Active Directory user for monitoring instead of the local user from the previous post.

Looking into this question, I learned that it is possible, but there are a few requirements. This post will explain how to configure MongoDB to use an Active Directory user account for authentication, specifically for the Aria Operations management pack.

MongoDB has a community edition, which was installed in my previous appliance. If we look at the ‘ldap’ configuration options documented here: https://www.mongodb.com/docs/manual/reference/configuration-options/, we’ll notice that those settings say “Available in MongoDB Enterprise only.”

The MongoDB folks have some really good documentation on installing mongodb-enterprise available here: https://www.mongodb.com/docs/manual/tutorial/install-mongodb-enterprise-on-ubuntu/#std-label-install-mdb-enterprise-ubuntu. I chose the Ubuntu version of this document as I already had Ubuntu template VMs available in my lab. The installation went very smoothly; I’ll include the commands I ran below as a quick reference. I had already run su - and was running as root, so I didn’t require the sudo shown in the documentation’s examples.

apt-get install gnupg curl
# no change, these packages already in image

curl -fsSL https://pgp.mongodb.com/server-7.0.asc | \
   gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg \
   --dearmor

echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] http://repo.mongodb.com/apt/ubuntu focal/mongodb-enterprise/7.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-enterprise-7.0.list

apt-get update

apt-get install -y mongodb-enterprise

ps --no-headers -o comm 1
# returns systemd

sudo systemctl start mongod

With a working MongoDB Enterprise installation, I was able to start configuring LDAP/Active Directory authentication. MongoDB docs discuss using groups only to delegate roles, so I created two objects in Active Directory:

Group: CN=LAB MongoDB Ent Monitoring,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org
User: CN=svc-mgdbeops,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org

This directory has an existing service account used for generic binds. I’m going to re-use this account: CN=svc-ldapbind,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org. In the real world the MongoDB admins would likely have their own service account for this purpose, or perhaps create a unique account per environment.

The first step is to grant my group limited access to the instance. I’ve also decided to create a local root user to use for administration, if needed. We’ll do all this using the mongosh command directly on the appliance.

mongosh # no credentials were required

var admin = db.getSiblingDB("admin")

// create a root user to have, just in case
admin.createUser(
   {
       user: "root", 
       pwd: "VMware1!", 
       roles:["root"]
   })
// returns: { ok: 1 }

// give our AD group limited access; the role name is the group's distinguished name
admin.createRole(
    {
        role: "CN=LAB MongoDB Ent Monitoring,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org",
        // no direct privileges; access comes from the inherited role below
        privileges: [],
        roles: [ { role: "clusterMonitor", db: "admin" } ]
    }
)
// returns: { ok: 1 }

With our permissions delegated, we now need to update our /etc/mongod.conf to make it aware of our directory. We’ll make two edits to this default file. First, in the network interfaces section, we’ll change the entry that binds to localhost only to allow binding to all IPs. I left the previous configuration as a comment, so I could revert easily if needed. The change looks like this in my config:

# network interfaces
net:
  port: 27017
  bindIpAll: true
  #bindIp: 127.0.0.1 

We’ll continue to the end of the file. I do not have security: or setParameter: sections, so I will create them:

security:
   authorization: "enabled"
   ldap:
      servers: "core-control-21.lab.enterpriseadmins.org:389"
      bind:
         queryUser: "CN=svc-ldapbind,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org"
         queryPassword: "VMware1!"
      transportSecurity: "none"
      authz:
         queryTemplate: "{USER}?memberOf?base"
      validateLDAPServerConfig: true
setParameter:
   authenticationMechanisms: "PLAIN"

MongoDB documentation has additional configuration options for userToDNMapping, but I’m not using those and opting instead to just pass the distinguished name as the user name.

With the configuration file updated, I restarted the mongod service and confirmed that it was running with the following commands:

sudo systemctl restart mongod.service
sudo systemctl status mongod.service
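Before moving on to Aria Operations, it can help to confirm the LDAP login from a shell. The following is a minimal sketch, not part of the original setup: the wrapper function name is made up, and it assumes mongosh is installed wherever it is run. It signs in with the PLAIN mechanism against the $external database, passing the distinguished name as the user name (as chosen above), and prints the authentication details for the session.

```shell
# Hypothetical wrapper: try an LDAP (PLAIN) login and print the resulting
# authentication info. mongosh prompts for the password because none is
# supplied on the command line.
# $1 = MongoDB host, $2 = user distinguished name
mongo_ldap_check() {
    mongosh --host "$1" --port 27017 \
        --authenticationMechanism PLAIN \
        --authenticationDatabase '$external' \
        --username "$2" \
        --eval 'printjson(db.runCommand({ connectionStatus: 1 }).authInfo)'
}

# Example:
#   mongo_ldap_check localhost \
#     "CN=svc-mgdbeops,OU=LAB Service Accounts,DC=lab,DC=enterpriseadmins,DC=org"
```

If the group mapping is working, the role named after the Active Directory group should show up under authenticatedUserRoles in the output; an authentication failure here points at the LDAP settings rather than at the management pack.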

Finally, in Aria Operations I was able to configure the adapter instance to use this LDAP credential. For the adapter name, I entered the short host name of the monitored server, and for the host attribute I used the fully qualified domain name. When creating the credential, I entered my user distinguished name, as well as selected ‘LDAP SASL’ for the type of authentication. I’ve included a screenshot below for reference:

With these properties configured, I was able to create the configuration of the adapter instance.

After a few minutes, the dashboards begin populating with the mongod information.

Hopefully this post helps with configuration of MongoDB Enterprise LDAP authentication.

Previous MongoDB posts:

Posted in Lab Infrastructure, Virtualization

Event Driven Automation: Domain Controller FSMO role has moved

In a prior post (https://enterpriseadmins.org/blog/scripting/event-driven-automation-with-aria-operations-for-logs-webhooks-and-jenkins/), I wrote about using Aria Operations for Logs to call Jenkins using a webhook to automate a process. In that post, Aria Operations for Logs watches for a specific vSphere event, and when that occurs (VM deleted), we check a couple of other systems (Active Directory & Aria Automation Config) and make sure the cleanup activities cascade to those other systems.

In this article we are going to do the same type of thing (one event triggers another), but instead of looking at vSphere events, we are going to use an event that comes from an agent deployed to Windows systems, specifically domain controllers, when a flexible single master operation (FSMO) role is moved. As with the previous example, we could look for any event; this is just something that is easy to trigger for demonstration purposes.

In my experience, these FSMO roles are rarely moved in a stable production environment. However, when the PDC Emulator role is moved, some firewall rules may need to be modified. This is because authentication failures caused by a bad password are forwarded to the system holding this role before the failure is reported to the user (https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/fsmo-roles). Since this type of change occurs rather infrequently, it is something that can be easily forgotten. To help with this, we are going to create a similar event-driven approach to alert on these events and start the appropriate actions.

Aria Operations for Logs – webhook payload

The webhook payload will be identical to the one used in our previous example, with one important change: we’ve updated the ‘token’ portion of the URL we will be posting to.

Aria Operations for Logs – alert definition

The Windows event entry for this is event ID 1458, so we’ll look for that happening. Our alert definition will be very simple:

For testing, I’m using the same ‘Real Time’ query which executes every minute. Due to the infrequent nature of this event, we could likely scale this back quite a bit. However, for building the rule and testing its execution, having it run more frequently was helpful.

Jenkins – New Freestyle project

As with the previous example, Jenkins is going to receive this alert via webhook and then run our custom script.

With the FSMO role moves, we will actually receive two events — one from the domain controller which previously held the role, and another event from the domain controller receiving the role. To prevent duplicate actions from triggering, the logic of our code is only going to take action for the event received from the domain controller receiving the role. This will help reduce alert fatigue.

Aria Operations for Logs is sending an event message which contains the event text as well as some additional extracted fields. In our previous post, we only needed to read the text of the message, but in this case we also need the extracted field showing which system was the source of our event. To make this happen, we’ll again name our post content parameter logmsg, but set its expression to the JSONPath $[0], which returns the full event message rather than just the text field used in our previous example.

As an example, our code looks like this… again it is just an example for illustration purposes. We could do anything we want/need with this event.

$jsonEvent = $env:logmsg | ConvertFrom-JSON

$previousOwnerCN = [regex]::Match( $jsonEvent.text.split("`n")[7], ",CN=(.*?),CN=Servers").Groups[1].Value.Trim()
$newOwnerCN = [regex]::Match( $jsonEvent.text.split("`n")[5], ",CN=(.*?),CN=Servers").Groups[1].Value.Trim()
$eventSourceHost = ($jsonEvent.fields | ?{$_.name -eq 'hostname'}).content.Trim()
$requestuser = ($jsonEvent.fields | ?{$_.name -eq 'userid'}).content.Trim()

 # Both domain controllers involved in the transfer will send an event message.  To eliminate duplicate actions, 
 # we will only action the event which comes from the new owner of the FSMO role.
if ($eventSourceHost -match $newOwnerCN) {
  $msgObject = $jsonEvent.text.split("`n")[3]
  $dnAsFqdn  = ($msgObject -replace '^.*?,DC=' -replace ',DC=','.').Trim()

  $movedRole = switch ($msgObject) {
    { $_ -match '^DC=' } { 'PDC Emulator - domain: ' + $dnAsFqdn }
    { $_ -match '^CN=RID Manager'} { 'RID Master - domain: ' + $dnAsFqdn }
    { $_ -match '^CN=Infrastructure'} { 'Infrastructure Master - domain: ' + $dnAsFqdn }
    { $_ -match '^CN=Schema'} { 'Schema Master - forest: ' + $dnAsFqdn }
    { $_ -match '^CN=Partitions'} { 'Domain Naming Master - forest: ' + $dnAsFqdn }
  }
  
  $friendlyMessage = "The role $movedRole has been moved to $newOwnerCN from $previousOwnerCN by $($requestuser)."
  if ($friendlyMessage -match 'PDC Emulator') { 
    $destIP = ([System.Net.Dns]::GetHostAddresses($eventSourceHost) | ?{$_.AddressFamily -eq 'InterNetwork'})[0].IPAddressToString
    $fwcrSubmitt = New-FirewallChangeRequest -requestor $requestuser -hostGroup 'Win-AD-Services' -changeType 'add' -ipAddress $destIP
    $friendlyMessage += '  For this role, additional firewall changes may be needed.  To help ensure this has not been overlooked, firewall change request '+$fwcrSubmitt+' has been submitted.'
  }
  $friendlyMessage
  
  Send-MailMessage -to 'bwuchner@example.com' -from 'jenkins@example.com' -Subject 'Domain Controller: FSMO Role moved' -Body $friendlyMessage -SmtpServer 'mail.example.com'
} else {
  "This event is being skipped as the eventSourceHost was $eventSourceHost which is not the newOwner $($newOwnerCN)."; exit 1
}

Note: the New-FirewallChangeRequest function is not covered by this post; it is an example of a custom function that submits an internal form.

Testing

Testing this process was rather easy in a lab. From Active Directory Users and Computers, you can change the operations master, moving it from one domain controller to another. Each move should trigger two Jenkins builds, with only one sending an email notification. The other build will exit with an error before any message is sent. In the case of a PDC Emulator move, the email text has a bit more detail and a firewall change request gets submitted. Here is a sample of the email notification sent for a PDC Emulator move. This text is also logged as ‘console output’ for the Jenkins build.

The role PDC Emulator - domain: enterpriseadmins.org has been moved to DR-CONTROL-21 from CORE-CONTROL-21 by LAB\bwuchner.  For this role, additional firewall changes may be needed.  To help ensure this has not been overlooked, firewall change request 123d1fd1-895a-47e9-9779-88dc33a32a16 has been submitted.
Finished: SUCCESS

Conclusion

This is another example of using Aria Operations for Logs to trigger a specific event to drive automation. With this webhook alert allowing us to connect the two systems, any event that occurs, whether vSphere, Windows agent, syslog stream, or anything else, can be used as the starting point to take action or send very specific notifications to any system we choose.

Posted in Lab Infrastructure, Scripting