Rebuilding a Failed vCenter Results in More Problems
- Ben Liebowitz
At work, I had a vCenter fail and, after working with VMware support for hours, I was told to rebuild the vCenter from scratch. I downloaded the vCenter ISO for the same version as the other vCenters that are linked (7.0.3.01700) and launched the installer. I went through the wizard and selected to join it to the existing SSO Domain. After rebooting it, the majority of the services wouldn’t start. After working with support again, we found the other linked vCenters still had the failed one in the replication database. You can see it by running the command below on EACH vCenter.
/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator
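If the raw output is hard to read, a small helper can trim it down to just the server names. This is only a sketch: it assumes each entry comes back as a DN that starts with cn=<server name>, and the function name list_vmdir_servers is my own invention, not part of any VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: reduce showservers output (read from stdin) to bare names.
# Assumption: every server entry is a DN whose first component is cn=<fqdn>.
list_vmdir_servers() {
    # Keep only lines that begin with cn= and print the value of that first component.
    sed -n 's/^[Cc][Nn]=\([^,]*\),.*/\1/p'
}

# Example use on a vCenter appliance (vdcrepadmin prompts for the SSO password):
# /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator | list_vmdir_servers
```

Saving that list from each vCenter makes it easy to compare them side by side and spot entries that don't belong.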
Sorry for the blocking in the image below. I had to remove any server/domain names.
You can see, I also have some old External PSC (Platform Service Controller) VMs in the database that I need to remove, as well as a failed vCenter and one I used a temporary IP to build.
Next, check the replication partner status and the Lookup Service location with the commands below.
/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator
/usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost
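To pick out unhealthy partners quickly, something like the filter below may help. It is a rough sketch that assumes showpartnerstatus prints a "Partner:" line, a "Host available:" line, and a "Partner is N changes behind." line for each partner. Check that against your own output before trusting it. The function name flag_unhealthy_partners is made up for this example.

```shell
#!/bin/sh
# Hypothetical filter: read showpartnerstatus output on stdin and print only
# the partners that look unhealthy. The line formats matched here are assumed.
flag_unhealthy_partners() {
    awk '
        /^Partner:/ { partner = $2 }                      # remember the current partner
        /^Host available:/ {
            if ($3 != "Yes") print partner ": host not available"
        }
        /^Partner is [0-9]+ changes behind/ {
            if ($3 + 0 > 0) print partner ": " $3 " changes behind"
        }
    '
}

# Example use on a vCenter appliance:
# /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator | flag_unhealthy_partners
```

No output means every partner reported as available and caught up, at least under the format assumptions above.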
The first step… Shut down ALL existing Linked vCenters and take Powered-Off Snapshots! BUT FIRST, let’s locate which ESXi Host each vCenter lives on.
get-vm vcsa | get-vmhost
If you have multiple vCenters, you can do…
get-vm vcenter1, vcenter2, vcenter3 | select Name, VMHost
Next, if you have monitoring software, it’s time to mute alerts! After that, we can shut down each vCenter.
shutdown-vmguest -vm vcenter1, vcenter2, vcenter3 -confirm:$false
With the vCenters powered off, you’ll want to take the ESXi host names from above and use them to connect directly via PowerShell.
If you’ve never connected to multiple vCenters before, you’ll need to run this command.
Set-PowerCLIConfiguration -DefaultVIServerMode Multiple
Now, let’s connect to the VMHosts via PowerShell.
$creds = get-credential #Store the domain or root creds to this variable
Connect-viserver lab01.homelab.local, lab02.homelab.local, lab03.homelab.local -credential $creds
Next, we’re going to want to take snapshots of each vCenter VM.
get-vm vcenter1, vcenter2, vcenter3 | new-snapshot -Name "Before CMSSO-Util" -Description "Taken by BLiebowitz on 12/12/23"
Then, we can start the vCenters back up again.
get-vm vcenter1, vcenter2, vcenter3 | start-vm
Once we give them a few minutes to start up, we can run the cmsso-util command to unregister the stale entries. Open an SSH session to each vCenter via PuTTY or whatever SSH tool you use. Log in as root. Then run cmsso-util. MAKE SURE YOU USE THE FULL NAME AS LISTED WITH THE VDCREPADMIN COMMAND. ALSO USE THE SSO USERNAME.
cmsso-util unregister --node-pnid psc01.homelab.local --username firstname.lastname@example.org
When finished, run the vdcrepadmin command again to verify all the stale entries are now gone.
Make sure you repeat this process on ALL linked vCenters, or the removed entries will replicate and come back again!
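For the final verification, here is a minimal sketch of a check you could run on each vCenter. It takes the names you unregistered as arguments and reports any that still show up in the showservers output. The function name report_remaining_nodes is hypothetical, and it assumes entries are listed in the cn=<server name>, form.

```shell
#!/bin/sh
# Hypothetical check: arguments are the names that should be gone;
# showservers output is read from stdin.
report_remaining_nodes() {
    listing=$(cat)                       # capture the listing once
    for node in "$@"; do
        # Fixed-string, case-insensitive match on the assumed "cn=<name>," form.
        if printf '%s\n' "$listing" | grep -qiF "cn=$node,"; then
            echo "still present: $node"
        fi
    done
}

# Example use on a vCenter appliance:
# /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator | report_remaining_nodes psc01.homelab.local
```

If it prints nothing on every linked vCenter, the stale entries should be gone everywhere.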
Ben Liebowitz, VCP, vExpert
NJ VMUG Leader