I came back from an awesome week at VMworld to find a very odd networking issue on several production ESXi 4.1 hosts. I first noticed that the hosts in question had an HA warning, so I attempted to ‘Reconfigure for VMware HA’ which resulted in the following error message:
Cannot complete the configuration of the HA agent on the host. Misconfiguration in the host network setup.
I decided to place one of the hosts into maintenance mode while I investigated this issue. This failed with several warnings during the vMotion operations:
The vMotion interface is not configured (or is misconfigured) on the Source host. Currently connected network interface 'Production' uses network 'Production' which is configured for different offload or security policies on the destination host than on the source host. The vMotion interface of the destination host uses network '<unknown network>', which differs from the network '<unknown network>' used by the vMotion interface of the source host.
I then decided to check the networking configuration to see what was going on. I clicked on the host's Configuration tab, then Networking, and waited. I never got more than the following screen, even after selecting ‘Refresh’.
Thinking something wasn’t displaying correctly, I moved to the Network Adapter link. What I found there was even more alarming:
Just for clarification, all six of those network interfaces should be assigned to vSwitches! The oddest part was the hosts still had active, running and responding virtual machines, but the host had no visible signs of network configuration.
I then pointed my vSphere Client directly at the host, thinking something was wrong in vCenter. No such luck; I got the same results. I enabled Tech Support Mode and logged in through SSH. I then listed all of the vSwitches using the following command:
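On ESXi 4.x the stock tool for listing vSwitches from the shell is esxcfg-vswitch with the -l flag. Here is a minimal sketch of that step. The wrapper function name list_vswitches is made up for illustration, and whether this matches exactly what was typed on these hosts is an assumption.

```shell
# List every vSwitch with its uplinks and port groups.
# Assumes the stock ESXi 4.x tool esxcfg-vswitch; the wrapper name is hypothetical.
list_vswitches() {
    if command -v esxcfg-vswitch >/dev/null 2>&1; then
        esxcfg-vswitch -l
    else
        echo "esxcfg-vswitch not found; run this on the ESXi host itself" >&2
        return 1
    fi
}
```

Call list_vswitches from the Tech Support Mode shell. Running esxcfg-nics -l alongside it gives the matching view of the physical NICs.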
Fortunately, all of the vSwitches were still intact, which explains why the VMs were still online. I then checked the esx.conf file to see if my NICs and portgroups were still properly defined:
cat /etc/vmware/esx.conf | grep -i nic
cat /etc/vmware/esx.conf | grep -i portgroup
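If you have several hosts to check, a quick count of the matching lines is easier to compare than pages of grep output. This is only a sketch: the function name count_conf_entries is hypothetical, and it assumes nothing about esx.conf beyond the path and the two search terms used above.

```shell
# Count config lines mentioning NICs and port groups.
# The default path is the one from the post; pass another file to test offline.
count_conf_entries() {
    conf="${1:-/etc/vmware/esx.conf}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    for key in nic portgroup; do
        # -c counts matching lines, -i ignores case (same match as grep -i)
        printf '%s: %s\n' "$key" "$(grep -ci "$key" "$conf")"
    done
}
```

A count of zero for either term would suggest the definitions really are gone, rather than just not being displayed by the client.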
Since everything was in order I went to the DCUI on the console and restarted the management agents on one host. A few seconds later everything was back in working order and I was able to re-enable HA. This was a very simple fix, but it is one of the weirdest network issues I’ve ever seen on an ESXi host.
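If the DCUI is not reachable, the same restart can be attempted from the shell. A minimal sketch, assuming the stock /sbin/services.sh script is present on the host. The function name restart_mgmt_agents and its optional path argument are hypothetical and exist only to make the snippet safe to try elsewhere.

```shell
# Restart the ESXi management agents from the shell instead of the DCUI.
# Assumes /sbin/services.sh exists (stock on ESXi); the wrapper name is hypothetical.
restart_mgmt_agents() {
    script="${1:-/sbin/services.sh}"
    if [ ! -x "$script" ]; then
        echo "$script not found or not executable" >&2
        return 1
    fi
    "$script" restart
}
```

Running VMs should stay up during the restart, but the host will briefly drop out of vCenter while the agents come back.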
I had the exact same issue on one of my ESXi 4.1 hosts after upgrading vCenter to version 5.0.
However, restarting the management agents from the DCUI didn’t work. I had to access the hidden console (Alt-F1) and restart the agents from the command line, i.e.: /sbin/services.sh restart
Thanks for the article; it pointed me to the fix.
Whew, I lost all of my networking after trying to add another port group. This post helped me; I had to do what GuyMontag did from the command line. Thanks for this post, this issue really had me sweating.
Thank You…Sooooo much !
I was scratching my head over the same issue for around 4 hours. This was very useful and resolved the issue.
Thanks for sharing 🙂
Good one mate..worked for me
Thank you this was a savior !