vCheck (daily report) version 5.31

As many of you know, I have created a vCheck feature request list (http://bit.ly/dGrNjh) using comments from the Virtu-Al.net site. I’ve been working through them as time permits.

In a post earlier in the week, I provided a solution to make ‘vCheck as a vSphere Client “Solutions and Application”‘. This post attempted to resolve feature request items #16 and #17. If you are interested, that post is available here: http://enterpriseadmins.org/blog/?p=258. It is not really a change to vCheck, but some steps that need to happen to make vCheck appear in vCenter.

Today, I’m posting more updates to vCheck. This updated version includes feature requests #2, #11 and #51.

# Version 5.31- bwuch: Bug fix for LockdownMode
# Version 5.30- bwuch: Add check for VMtools installer connected
# Version 5.29- bwuch: Add check for VM capacity forecasting
# Version 5.28- bwuch: Change to Get-HTMLTable function for possible performance improvements

I’m sure update 5.28 was suggested in the Virtu-Al.net comments, but for some reason I couldn’t find it on the feature request list.

Update 5.29 isn’t perfect and I wanted to let everyone know. There are comments in the code, but I wanted to add them to this post for reference. Instead of looping through all of the virtual machines and adding up the amount of space used, I simply subtract the datastore free space from the capacity and assume that is how much is being used. I also assume that no more than 85% of a datastore’s capacity will be used (to reserve room for thin-provisioned growth, snapshots, changed block tracking and log files). These assumptions make the code run pretty fast, but I’ve seen some oddities in my test environment. (For example, -41 virtual machines remaining in a datacenter that I’m sure has enough free space for 1 or 2 more VMs 🙂) Here is what I’ve added to the comments of the script for reference:

# The disk forecast will be per datacenter instead of per cluster since
# Get-Datastore -Entity only supports VirtualMachine, VMHost, and Datacenter objects.
# To improve performance in code, we are going to make the following assumptions
#   Assumption 1.) Disk capacity - Free Space = space used by VMs
#   Assumption 2.) used space / count of VMs = Avg Space used per VM
#   Assumption 3.) we will reserve 15% of capacity for overhead
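The three assumptions reduce to a few lines of arithmetic. The following is a minimal sketch on plain numbers, not the code that ships in vCheck: the function name `Get-VMsRemainingEstimate` and its parameters are hypothetical, and the sample figures are made up for illustration. It also shows one way a negative count can appear: whenever used space already exceeds the 85% ceiling, the remaining headroom is negative.

```powershell
# Hypothetical helper (not part of vCheck): applies the three assumptions to plain numbers.
function Get-VMsRemainingEstimate {
    param(
        [double]$CapacityGB,          # total datastore capacity in the datacenter
        [double]$FreeSpaceGB,         # total datastore free space in the datacenter
        [int]$VMCount,                # number of VMs in the datacenter
        [double]$ReservePercent = 15  # Assumption 3: overhead reserve
    )

    $usedGB = $CapacityGB - $FreeSpaceGB                      # Assumption 1
    if ($VMCount -le 0 -or $usedGB -le 0) { return $null }    # no meaningful average to divide by

    $avgPerVMGB = $usedGB / $VMCount                          # Assumption 2
    $usableGB   = $CapacityGB * (1 - $ReservePercent / 100)   # Assumption 3

    # Headroom below the ceiling, expressed in "average VMs"
    [int][math]::Floor(($usableGB - $usedGB) / $avgPerVMGB)
}

# Made-up sample figures:
Get-VMsRemainingEstimate -CapacityGB 10000 -FreeSpaceGB 4000 -VMCount 100   # 41
Get-VMsRemainingEstimate -CapacityGB 10000 -FreeSpaceGB 1000 -VMCount 150   # -9 (already past the 85% ceiling)
```

In the second call the datastores still have 1000 GB free, yet the estimate is negative because 9000 GB used is above the 8500 GB ceiling.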

If anyone else has a better solution (fast and accurate) please let me know.

Here is version 5.31 for those interested: Download vCheck5.31.ps1.txt

20 Comments

  1. I have added the following:

    Additional Check to see if VM’s are using the VMXNET3 vNic

    # ---- VM vNic Issues ----
    If ($ShowvNicIssues){
        Write-CustomOut "..Checking VM vNic Issues"
        # Flag running guests that have at least one NIC that is not VMXNET3
        $FailvNIC = $VM | Where {$_.Guest.State -eq "Running" -And (($_.Guest.Nics | %{$_.Device.Type}) -ne "VMXNET3")} | select Name,@{N="IP addr";E={[string]::Join(',',$_.Guest.IPAddress)}},@{N="NIC type";E={[string]::Join(',',($_.Guest.Nics | %{$_.Device.Type}))}} -ErrorAction SilentlyContinue | sort Name
        If (($FailvNIC | Measure-Object).count -gt 0 -or $ShowAllHeaders) {
            # Count the vNic results here (the original header used $FailTools by mistake)
            $MyReport += Get-CustomHeader "VM vNic Issues : $(@($FailvNIC).Count)" "The following VMs are not running a VMXNET3 vNic; these should be checked and corrected if necessary"
            $MyReport += Get-HTMLTable $FailvNIC
            $MyReport += Get-CustomHeaderClose
        }
    }
    #***** End Code *****

    Also, if you change the vSwitch Port Check to
    foreach ($vswitch in ($vhost|Get-VirtualSwitch -Standard))
    there will be fewer false positives on Distributed Virtual Switches.
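The vNic filter in the comment above relies on how PowerShell's `-ne` behaves when its left operand is a collection: it returns the elements that differ rather than a single Boolean. A minimal sketch on plain strings follows; the function name `Test-HasNonVmxnet3Nic` and the NIC type values are hypothetical stand-ins for whatever `$_.Device.Type` returns in a given environment.

```powershell
# Hypothetical helper: true when at least one NIC type in the list is not VMXNET3.
function Test-HasNonVmxnet3Nic {
    param([string[]]$NicTypes)

    # With an array on the left, -ne returns the elements that do NOT equal the
    # right-hand value. The comparison is case-insensitive by default.
    $others = @($NicTypes -ne 'VMXNET3')
    $others.Count -gt 0
}

Test-HasNonVmxnet3Nic -NicTypes 'Vmxnet3', 'e1000'   # True: one NIC is e1000
Test-HasNonVmxnet3Nic -NicTypes 'Vmxnet3', 'Vmxnet3' # False: every NIC is VMXNET3
```

A VM with a mix of adapters is therefore reported, which seems to be the intent of the check.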

  2. Hi, first of all I have been using the report and absolutely think it is fantastic.

    I was wondering if you had any thoughts as to how it could be targeted at multiple Virtual Centres to produce one report.

    Thanks again for all your time spent on this.

    Regards
    Sid

  3. Thank you for your comments. You are correct, this report is fantastic; however, I cannot take credit for it. Anything I’ve done is just a thin layer of icing on the awesome cake provided by Virtu-al.net.

    As for support for multiple vCenters, it is something I would like to add. There are two ways I can think of implementing this. One would group all of the checks together and add the vCenter name to each table in the results. The other (simpler) way would involve just grouping all of the vCenter results together, as if you had one huge vCenter. You lose granularity in the reports, but it would be much easier to implement. I believe some variation on the following command:

    Set-PowerCLIConfiguration -DefaultVIServerMode multiple -Confirm:$false

    would allow multiple vCenters to be added and then included in the check.
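    The first option (adding the vCenter name to each table) could look roughly like the sketch below. It uses plain objects instead of PowerCLI output so it runs anywhere; the function name `Add-VCenterColumn`, the server names, and the sample rows are all hypothetical.

```powershell
# Hypothetical helper: prepend a vCenter column to the rows returned by one check.
function Add-VCenterColumn {
    param(
        [string]$VCenter,
        [object[]]$Rows
    )
    $Rows | Select-Object @{N = 'vCenter'; E = { $VCenter }}, *
}

# Made-up sample rows standing in for the output of the same check run per vCenter.
$resultsByVCenter = @{
    'vcenter-a.example.local' = @([pscustomobject]@{ Name = 'vm01'; SnapshotAgeDays = 20 })
    'vcenter-b.example.local' = @([pscustomobject]@{ Name = 'vm07'; SnapshotAgeDays = 31 })
}

# Merge the per-vCenter results into one table that keeps the source server visible.
$combined = foreach ($vc in ($resultsByVCenter.Keys | Sort-Object)) {
    Add-VCenterColumn -VCenter $vc -Rows $resultsByVCenter[$vc]
}

$combined | Format-Table -AutoSize
```

    The simpler second option would skip the extra column and just concatenate the rows.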

    The only problem is that I’ve recently moved to a new house. I’ve been working with my local telephone provider (AT&T) to get internet access but it is taking MUCH longer than anticipated. Until I have internet access my development efforts are on hold. Please feel free to leave comments and let the collaboration roll…I’m hoping to get back to work on this within the next month or two.

  4. Have a look at this; it looks better than yours:

    $MBFree = 1.5GB / 1MB # 1536
    $ShowGuestDriveSpace = $true
    # drive size
    If ($ShowGuestDriveSpace)
    {
        # Threshold converted back to GB for display (1.5)
        $MBFreeDisplay = [math]::Round(($MBFree * 1MB) / 1GB, 2)

        Write-CustomOut "..Checking for Guests with drives less than $MBFree MB"
        $MyCollection = @()
        # Only powered-on, non-template VMs with VMware Tools running report guest disk data
        $AllVMs = $FullVM | Where {-not $_.Config.Template } | Where { $_.Runtime.PowerState -eq "poweredOn" -And ($_.Guest.toolsStatus -ne "toolsNotInstalled" -And $_.Guest.ToolsStatus -ne "toolsNotRunning")}
        $SortedVMs = $AllVMs | Select *, @{N="NumDisks";E={@($_.Guest.Disk.Length)}} | Sort-Object -Descending NumDisks
        ForEach ($VMdsk in $SortedVMs)
        {
            Foreach ($disk in $VMdsk.Guest.Disk)
            {
                # One report row per guest disk below the threshold
                if (([math]::Round($disk.FreeSpace / 1MB)) -lt $MBFree)
                {
                    $Details = New-Object PSObject
                    $Details | Add-Member -Name Name -Value $VMdsk.name -MemberType NoteProperty
                    $Details | Add-Member -Name "Disk Drive" -MemberType NoteProperty -Value $Disk.DiskPath
                    $Details | Add-Member -Name "Disk Capacity(GB)" -MemberType NoteProperty -Value ([math]::Round($disk.Capacity / 1GB,2))
                    $Details | Add-Member -Name "Disk FreeSpace(GB)" -MemberType NoteProperty -Value ([math]::Round($disk.FreeSpace / 1GB,2))
                    $MyCollection += $Details
                }
            }
        }
        If (($MyCollection | Measure-Object).count -gt 0)
        {
            $MyReport += Get-CustomHeader "VMs with Disk Drives less than $MBFreeDisplay GB : $($MyCollection.count)" "The following guests have a Disk Drive less than $MBFreeDisplay GB Free. If the disk fills up it may cause issues with the guest Operating System"
            $MyReport += Get-HTMLTable $MyCollection
            $MyReport += Get-CustomHeaderClose
        }
    }

    What is the 0 all about?

    Name   Disk0path  Disk0Capacity(MB)  Disk0FreeSpace(MB)
    XXXXX  D:\        102399             475
    XXXXX  C:\        8182               1463

  5. How about adding an alert for misconfigured limits, like the number of VMs with resource limits?

    For example, memory configured for 4 GB and the limit set to 2 GB in error.

  6. How long does it run for? I ran the script and it has been stuck at checking “capacity info” for more than 16 hours. It would be nice if there were some kind of logging to find out what’s going on… Is it normal to be collecting capacity info for 16 hours? I have around 20 ESX hosts and a handful of datastores.

  7. I run this script on an environment with around 30 hosts and 800 virtual machines in about 2.5 hours. The ‘capacity info’ section you refer to does take the longest of any check. I believe it has to do with the fact that it looks at statistics with Get-Stat. However, if I remember correctly, it only takes about 30 to 35 minutes in my tests. This could vary greatly, especially if you have adjusted the stat levels.

    If you haven’t already, you may want to try upgrading to PowerCLI 4.1.1. It does appear to run faster than previous versions.

  8. @Kevin – There should already be a section for VMs with resource limits. It doesn’t actually check for a mismatch; if a VM has 2GB of RAM and a 2GB limit it would also appear on the report. What I prefer is to set VMs to unlimited; if a VM has a limit that matches the allocation and a junior admin increases the RAM you might end up with a mismatch. I’d prefer to get in front of the issue, remove the limit and not have to worry in the future.

    This section was one of the first things I added to the report (around version 5.03). The code was actually provided in a comment over at Virtu-Al.net.

  9. Hi Brian,

    I am on 4.1.1 and have noticed it is just stuck at collecting the “capacity info”. I did not understand your comment about the stat level. How can I find it? Could that cause the report to run for days?

  10. @Alok, here is an article that talks about changing the statistics level from the default of 1 to something higher (2, 3 or 4): http://vmguy.com/wordpress/index.php/archives/401. The higher the value the more statistics available…which I would think would cause Get-Stat to run longer (it is by far the most time consuming cmdlet I’ve used).

    Just curious, 1.) do your clusters have HA enabled (I’ve never tested this script against a cluster without HA/DRS, so I’m not sure how it would behave.) and 2.) do you have a large number of clusters? Looking at the code, I could see where the more clusters you have the longer it would take to complete. However, I run this against 5 clusters and the capacity planner section only takes ~45 minutes to complete.

    I could make a new script that has just the capacity planner section with a lot more output (a debug version). Would you be interested in running that to help isolate where the problems come from?

  11. Hi,

    I think the script is great, but I have one problem with it.

    The resulting html-file only shows the headers of each section.

    For example:

    Snapshots (Over 14 Days Old) : 5
    VMware snapshots which are kept for a long period of time may cause issues, filling up datastores and also may impact performance of the virtual machine.

    No information is shown which vms actually do have snapshots. This problem appears with every test this script makes.

    Before I switched to v5.36 I used v5.00, which still works fine.

    Do you have any idea why the script behaves like this in my environment? (I use PowerCLI 4.1.1.)

    Best regards,

    Martin

  12. @Martin,

    I believe the issue you identified is a feature ‘5.10: Added option to include all headers — even on tests that return no results’

    Depending on the audience of your report, someone could say ‘I wish this report would check for snapshots over 14 days.’ Even though the report is already checking, certain people may not know everything that has been checked (and excluded due to no results).

    If you would like to see the errors-only HTML report (like version 5.00) you can change line 167 of the vCheck5.ps1 from:
    $ShowAllHeaders = $true
    to:
    $ShowAllHeaders = $false

    This will switch the behavior back to the version 5.00 format.

    Thanks,
    Brian
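    The vNic check posted in the first comment shows the shape of the condition that `$ShowAllHeaders` feeds into: a section is written when it has results, or when the flag is on. A minimal sketch of that decision, using a hypothetical function name `Test-ShowSection` and plain arrays in place of real check results:

```powershell
# Hypothetical helper mirroring the condition
#   If (($Results | Measure-Object).count -gt 0 -or $ShowAllHeaders)
function Test-ShowSection {
    param(
        [object[]]$Results = @(),
        [bool]$ShowAllHeaders
    )
    (($Results | Measure-Object).Count -gt 0) -or $ShowAllHeaders
}

Test-ShowSection -Results @()       -ShowAllHeaders $true    # True:  header shown even with no rows
Test-ShowSection -Results @()       -ShowAllHeaders $false   # False: empty section is skipped
Test-ShowSection -Results @('vm01') -ShowAllHeaders $false   # True:  section has results
```

    With the flag set to `$false`, only sections that actually returned rows appear in the report.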

  13. @Brian: I think I know the problem now. Our clusters don’t have HA/DRS enabled for some reason. (I inherited them 🙂)

    Is there a possibility to test or have a version that can run on these clusters? Or do I just live without using this awesome script 🙁

    It may take some time before I can get my change control managers to approve and implement this. Any likelihood of having a non-HA script (if it’s not too much work)? 🙂

    Thanks a lot.

  14. @Alok: I’ll look into the capacity planner section of the script and see what we can do about handling the non-HA clusters. It might take me a bit to set up a virtual cluster without HA for testing, but hopefully not as long and painful as a change control review 🙂

    Until I get that all worked out, there is a workaround that will let you exclude that section of the vCheck — change line #283 from
    $ShowCapacityInfo = $true

    to
    $ShowCapacityInfo = $false

    This will exclude the capacity check but should allow the script to keep processing and give you a report.

  15. Brian,
    You da man!!!
    Thanks a lot for the workaround as well as the willingness to work on the non-HA cluster script. Genuinely appreciate it!

    I had to disable a few more cluster-related options over multiple runs and finally got it to complete. The report did show up, but for a lot of the sections I did not see any data. For example:

    Host uptime warnings : 17
    The following hosts have uptime over 60 days (and may require security patches) or under 3 days (recently added/rebooted).

    But there is nothing below that. Any ideas?

  16. I’m currently running v5, downloaded 5.31 today and tried to run. Seems to take the same period of time to run, but when the report comes up all I have is subject titles and no detail information except for top title information like number of hosts, vms, etc….

  17. Hi, I just found the updated vCheck script on your site. I was still running V5 from Virtu-Al. I may have an update for the “VCB Garbage” check.
    We are using Symantec Backup Exec 2010 for snapshot backups. It creates snapshots with names starting with “SYMC”. To check whether any are left over after a backup, I added “SYMC” to line 2014 in the script.
    It now looks like:
    $VCBGarbage = $VM |where { (Get-Snapshot -VM $_).name -contains "VCB|Consolidate|veeam|SYMC" } |sort name |select name

    I have not tested this change, because orphaned snapshots from Symantec do not occur very often.
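    One thing worth checking before relying on a filter like the one above: PowerShell's `-contains` tests whether a collection holds an element equal to the given value, so a string such as `VCB|Consolidate|veeam|SYMC` is compared as a literal name and not as a pattern. `-match` treats the right-hand side as a regular expression. The sketch below shows the difference on plain strings; the snapshot names and the function name `Test-GarbageSnapshotName` are hypothetical.

```powershell
# Made-up snapshot names for illustration only.
$names   = @('SYMC-FULL 01-01-2011', 'before patching')
$pattern = 'VCB|Consolidate|veeam|SYMC'

$names -contains $pattern   # False: no snapshot is literally named "VCB|Consolidate|veeam|SYMC"
$names -match $pattern      # Returns 'SYMC-FULL 01-01-2011': -match applies the regex to each element

# Hypothetical helper: true when any snapshot name matches the pattern.
function Test-GarbageSnapshotName {
    param(
        [string[]]$SnapshotNames,
        [string]$Pattern = 'VCB|Consolidate|veeam|SYMC'
    )
    @($SnapshotNames -match $Pattern).Count -gt 0
}

Test-GarbageSnapshotName -SnapshotNames $names               # True
Test-GarbageSnapshotName -SnapshotNames 'before patching'    # False
```

    If the check never reports anything, swapping `-contains` for `-match` in the `where` block may be worth a try.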
