Any day you learn something new is a good day

Reason #703 of why I love VMware

Posted: 27-Jul-2013 15:36

A tale of two dead servers - or as I prefer to call it 'holy cr*p why won't these b*stards boot?'

Yesterday we had a scheduled power outage for our whole site that was planned to last longer than our data center UPS could stay up for. Being the paranoid engineers that we are, we carefully went through all our physical & virtual boxes dotted around the campus and shut them all down.

Several of our ESXi hosts had uptime over 500 days and this was their first power off in a very long time. Needless to say I was a tad nervous about bringing them back online - even though our disaster recovery offsite backups are all good, having to recover from them would ruin my weekend.

Well, after a 50-minute power cut we were ready to power everything back on. Everything bar two ESXi hosts (of the six on site) booted fine. One host, booting ESXi 4.1 off a USB stick, failed on vga.z, and the other, our newest IBM x3650, wouldn't go past a blinking cursor on the screen. We eventually worked out that the fault in both cases was that the USB sticks they were booting off had decided that they'd had their last boot. It was the previous one.

Thankfully VMware is prepared for this kind of problem, especially if you're booting ESXi off a USB stick - the hypervisor and the VMs themselves are independent of each other. (We can't afford a SAN, so we have local storage on each host.)

For the older ESXi 4.1 server all I had to do was find another USB stick we had lying around already prepared and boot off it. The x3650 was on ESXi 5, so I had to find my ESXi 5 CD in my drawer, install it to a USB stick, then boot off that.

Then it was just a case of recreating the appropriate virtual switches, finding the datastores (ESXi 5 found them for us; 4.1 had to be told to go looking), adding the VMs back to the inventories and starting them. We did have a quick look at the .vmx files to confirm that we'd named the vSwitches correctly, but other than that it was find, add & start for all the VMs across those two hosts.
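For next time, that quick look at the .vmx files could be scripted. This is only a rough sketch, not what we actually ran: it assumes the network label lives on lines of the form `ethernetN.networkName = "..."` (that label is the port group name, which sits on the vSwitch you recreate), and the sample text, the labels in it and the function names are all made up for illustration.

```python
import re

# Matches lines such as: ethernet0.networkName = "VM Network"
NETWORK_LINE = re.compile(
    r'^\s*ethernet(\d+)\.networkName\s*=\s*"(.*)"\s*$', re.IGNORECASE
)


def network_names(vmx_text):
    """Return {adapter index: network label} found in the text of a .vmx file."""
    names = {}
    for line in vmx_text.splitlines():
        match = NETWORK_LINE.match(line)
        if match:
            names[int(match.group(1))] = match.group(2)
    return names


def missing_networks(vmx_text, recreated):
    """Return the labels a VM expects that are not among the recreated ones."""
    return sorted(set(network_names(vmx_text).values()) - set(recreated))


# Hypothetical .vmx excerpt, invented for this example.
SAMPLE_VMX = '''
displayName = "fileserver01"
ethernet0.present = "TRUE"
ethernet0.networkName = "VM Network"
ethernet1.present = "TRUE"
ethernet1.networkName = "Backup LAN"
'''

if __name__ == "__main__":
    print(network_names(SAMPLE_VMX))
    # Pretend only "VM Network" has been recreated on the rebuilt host so far.
    print(missing_networks(SAMPLE_VMX, ["VM Network"]))
```

Point it at the text of each .vmx on the datastore along with the list of labels you've recreated, and anything it reports as missing is a network the VM will come up without.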

The last thing to do was remove and re-add the hosts to vCenter, which was one of the VMs on the x3650, but that went painlessly as well.

Weekend saved. We spent more time confirming that it was the USB sticks at fault than we did recovering from them.

Next week's jobs - purchase some 'certified for VMware' USB sticks to boot from that will last the distance, and placate Veeam backup & recovery, which failed to back up last night because, while everything has the same name, they're not the same VMs anymore and it can't find them.


Comment by jaymz, on 30-Jul-2013 11:05

I too have had a similar experience where ESX wouldn't boot; I re-ran the installer and voilà! It was all running again. I document all the settings and these are stored with each server, so it was a breeze to get the system running again. Install takes such a short amount of time and it has a nice option to preserve all the found datastores :) Couple this with the EMC SANs and it is all a breeze (EMC SANs will talk through to the ESX host and create the datastores for you over iSCSI).


nzsouthernman's profile

New Zealand

This blog is mainly going to be for writing down things when I work them out so when I have to try and do it again I don't have to think too hard.  And also to comment on stuff.  Hopefully not too much rant /rant involved.
