Yesterday we had a scheduled power outage for our whole site, planned to last longer than our data center UPS could stay up for. Being the paranoid engineers that we are, we carefully went through all our physical & virtual boxes dotted around the campus and shut them all down.
Several of our ESXi hosts had uptimes of over 500 days, and this was their first power-off in a very long time. Needless to say, I was a tad nervous about bringing them back online - even though our offsite disaster recovery backups are all good, having to recover from them would ruin my weekend.
Well, after a 50-minute power cut we were ready to power everything back on. All bar two of the six ESXi hosts on site booted fine - one host booting ESXi 4.1 off a USB stick failed on vga.z, and the other, our newest IBM x3650, wouldn't get past a blinking cursor on the screen. We eventually worked out that in both cases the USB sticks they were booting off had decided they'd had their last boot. It was the previous one.
Thankfully VMware is prepared for this kind of problem, especially if you're booting ESXi off a USB stick - the hypervisor and the VMs themselves are kept completely separate. (We can't afford a SAN, so each host has local storage.)
For the older ESXi 4.1 server all I had to do was find another USB stick we had lying around already prepared and boot off it. The x3650 was running ESXi 5, so I had to dig my ESXi 5 CD out of my drawer, install it to a fresh USB stick and boot off that.
Then it was just a case of recreating the appropriate virtual switches, finding the datastores (ESXi 5 found them for us; 4.1 had to be told to go looking), then finding and adding the VMs to the inventories and starting them. We did have a quick look at the .vmx files to confirm that we'd named the vSwitches correctly, but other than that it was find, add and start all the VMs across those two hosts.
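For anyone wanting to follow along, the steps above can be sketched from the host's console (Tech Support Mode / ESXi Shell). The vSwitch, uplink, datastore and VM names below are illustrative examples, not the ones from our site - match them to whatever your .vmx files expect:

```shell
# Recreate a standard vSwitch, give it an uplink and a port group
# (ESXi 5 syntax; names here are examples)
esxcli network vswitch standard add --vswitch-name=vSwitch0
esxcli network vswitch standard uplink add --vswitch-name=vSwitch0 --uplink-name=vmnic0
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch0 --portgroup-name="VM Network"

# On 4.1 we had to tell the host to go looking for existing VMFS volumes
esxcfg-volume -l                 # list unresolved/snapshot VMFS volumes
esxcfg-volume -M datastore1      # mount one persistently, keeping its signature

# Quick check of which port group each VM expects before powering it on
grep -i networkname /vmfs/volumes/datastore1/*/*.vmx

# Register each VM back into the inventory and start it
vim-cmd solo/registervm /vmfs/volumes/datastore1/myvm/myvm.vmx
vim-cmd vmsvc/getallvms          # confirm the inventory looks right
vim-cmd vmsvc/power.on 1         # power on by the VM id registervm printed
```

These commands only run on the ESXi host itself; the same steps can of course be done through the vSphere Client, which is what we mostly did.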
The last thing to do was remove and re-add the hosts to vCenter, which was itself one of the VMs on the x3650, but that went painlessly as well.
Weekend saved. We spent more time confirming that it was the USB sticks at fault than we did recovering from them.
Next week's jobs - purchase some 'certified for VMware' USB sticks to boot from that will last the distance, and placate Veeam Backup & Replication, which failed to back up last night because, while everything has the same names, they're not the same VMs anymore and it can't find them.
Comment by jaymz, on 30-Jul-2013 11:05
I too have had a similar experience, where ESX wouldn't boot; re-ran the installer and voila! It was all running again. I document all the settings and these are stored with each server, so it was a breeze to get the system running again. The install takes such a short amount of time and it has a nice option to preserve all the found datastores :) Couple this with the EMC SANs and it is all a breeze (EMC SANs will talk through to the ESX host and create the datastores for you over iSCSI)