As part of routine maintenance, it is sometimes necessary to take an Aria Operations cluster offline. For example, it is recommended to take the cluster offline to perform backups (https://docs.vmware.com/en/VMware-Aria-Operations/8.12/Best-Practices-Operations/GUID-1D058B4A-93BA-44D1-8794-AE8E1B96B3E4.html).
Since most folks want to schedule backups, it is important to be able to leverage automation to take the cluster offline. There is a cluster management API guide at https://ops.example.com/casa/api-guide.html that has some details on how to do this.
Authentication
When logging into this API, I provided the admin username/password combination. Here is an example of checking the cluster state using that method:
$creds = Get-Credential
(Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Credential $creds).cluster_online_state_snapshot
However, I’d prefer to use a centrally managed service account in Active Directory for such tasks. The ability to do this was first introduced in vRealize Operations 8.6 (doc) and still exists in Aria Operations 8.18 (doc). It relies on an Active Directory configuration that is separate from the one in the product UI. The linked docs show where and how to configure this identity provider from the /admin interface. Here is a screenshot showing this configuration:
Once Active Directory is configured for admin operations, we need to change our API authentication slightly to be able to use it. In the original example, we provided our username & password as a PowerShell credential object. In this example, we’ll make an extra API call to authenticate, then use the resulting bearer token as a header when checking the status. A code sample is below. Note the Authorization header sent to the authorize resource: it passes vrops-ldap followed by a base64 encoding of the username (as an AD userPrincipalName), a colon, and the password. That resource returns a token that we then provide as a bearer header to check the cluster status.
$b64 = [System.Convert]::ToBase64String([System.Text.encoding]::ASCII.GetBytes("h267-opsbu@lab.enterpriseadmins.org:VMware1!"))
$authorize = Invoke-RestMethod -Uri 'https://ops.example.com/casa/authorize' -Method Post -ContentType 'application/json' -Headers @{Authorization="vrops-ldap $b64"; Accept='application/json'}
(Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'} -ContentType 'application/json').cluster_online_state_snapshot
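The sample above hardcodes the password in the script. As a rough sketch of one alternative, the header could be built from a PowerShell credential object instead. The function name New-CasaLdapAuthHeader is hypothetical (it is not part of the casa API), and ASCII encoding is kept only to match the example above.

```powershell
# Hypothetical helper (not part of the casa API): builds the request headers
# from a PSCredential so the password does not have to live in the script.
function New-CasaLdapAuthHeader {
    param(
        [Parameter(Mandatory)]
        [System.Management.Automation.PSCredential]$Credential
    )
    # UserName keeps the userPrincipalName exactly as it was entered;
    # GetNetworkCredential() exposes the plain-text password for encoding.
    $password = $Credential.GetNetworkCredential().Password
    $pair     = '{0}:{1}' -f $Credential.UserName, $password
    # ASCII mirrors the original example; non-ASCII characters would be lost.
    $b64      = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($pair))
    @{ Authorization = "vrops-ldap $b64"; Accept = 'application/json' }
}

# Usage (assumes $creds = Get-Credential, entered as an AD userPrincipalName):
#   $authorize = Invoke-RestMethod -Uri 'https://ops.example.com/casa/authorize' -Method Post `
#       -ContentType 'application/json' -Headers (New-CasaLdapAuthHeader -Credential $creds)
```

For a scheduled task, the credential would still need to come from somewhere non-interactive, such as a secrets vault; that part is outside this sketch.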
Taking the cluster offline
With the authentication sorted out above, we can now post to this API to take the cluster offline. You’ll notice that we set the state to offline and provide a reason why. The example uses the same bearer token that we created in the above example.
$body = @{ 'online_state'='OFFLINE'; 'online_state_reason'='Lets back this thing up.'} | convertto-json
Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state -Body $body -Method POST -ContentType 'application/json' -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}
The above example submits a request to take the cluster offline but returns immediately after doing so. We could append ?async=false to the URI so that our command waits until completion. Another option would be to submit an async request (the default), then loop and periodically check the cluster state using the earlier ‘get’ request until the cluster is offline. I prefer the periodic polling option, as you can code in your own counter/timing/failure logic as needed.
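The polling approach could look something like the sketch below. Wait-CasaClusterState is a hypothetical helper name, and the timeout and interval defaults are arbitrary. The state lookup is passed in as a script block so the loop does not assume anything about the shape of the API response.

```powershell
# Hypothetical helper: polls until a state-returning script block yields the
# target value, and throws if the timeout passes first.
function Wait-CasaClusterState {
    param(
        [Parameter(Mandatory)][scriptblock]$GetState,   # returns the current state as a string
        [Parameter(Mandatory)][string]$TargetState,     # for example 'OFFLINE'
        [int]$TimeoutSeconds  = 900,                    # arbitrary default
        [int]$IntervalSeconds = 15                      # arbitrary default
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        $state = & $GetState
        if ($state -eq $TargetState) { return $state }
        if ((Get-Date) -ge $deadline) {
            throw "Cluster state is '$state' after $TimeoutSeconds seconds; expected '$TargetState'."
        }
        Start-Sleep -Seconds $IntervalSeconds
    }
}

# Usage, ASSUMING the GET response exposes an online_state property
# (check the field names your cluster actually returns):
#   $headers = @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}
#   Wait-CasaClusterState -TargetState 'OFFLINE' -GetState {
#       (Invoke-RestMethod -Uri 'https://ops.example.com/casa/sysadmin/cluster/online_state' -Headers $headers).online_state
#   }
```

Throwing on timeout lets a scheduled job fail loudly instead of starting a backup against a cluster that never went offline.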
If you check out the docs at /casa/api-guide.html, you’ll also see examples of setting the “Show reason on maintenance page” checkbox via the JSON body.
Bringing the cluster back online
After our maintenance / backup task is complete, we’ll want to bring the cluster back online. In this example we don’t need to provide a reason in our body.
$body = @{ 'online_state'='ONLINE'} | convertto-json
Invoke-RestMethod -URI https://ops.example.com/casa/sysadmin/cluster/online_state?async=false -Body $body -Method POST -ContentType 'application/json' -Headers @{Authorization="Bearer $($authorize.accessToken)"; Accept='application/json'}
In this example I’m using ?async=false so that the API call doesn’t return until the cluster is back online. Again, we could opt to use the default async request and periodically poll the service if we’d like.
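When the offline and online calls are scripted around a backup, it may be worth making sure the cluster is brought back online even if the backup step fails. One way to sketch that is a try/finally wrapper. Invoke-CasaMaintenance and its three script block parameters are hypothetical names used only for illustration.

```powershell
# Hypothetical wrapper: takes the cluster offline, runs the maintenance task,
# and always attempts to bring the cluster back online afterwards.
function Invoke-CasaMaintenance {
    param(
        [Parameter(Mandatory)][scriptblock]$TakeOffline,  # e.g. the OFFLINE POST shown earlier
        [Parameter(Mandatory)][scriptblock]$Task,         # e.g. the backup job
        [Parameter(Mandatory)][scriptblock]$BringOnline   # e.g. the ONLINE POST shown above
    )
    # If going offline fails, nothing else runs and the error propagates.
    & $TakeOffline
    try {
        & $Task
    }
    finally {
        # Runs whether the task succeeded or threw.
        & $BringOnline
    }
}

# Usage sketch (the script block bodies are placeholders for the calls above):
#   Invoke-CasaMaintenance -TakeOffline { <# OFFLINE POST #> } -Task { <# backup #> } -BringOnline { <# ONLINE POST #> }
```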
Conclusion
The casa API is very useful for automating cluster management tasks. This article focuses on a few examples related to cluster state changes and authentication, but the API supports many other things, like PAK file uploads, NTP & certificate management, and even the configuration of AD authentication. You should check out /casa/api-guide.html on an Aria Operations node for more examples.