Nowadays, whenever I get to open up brand-new cutting-edge gear, it feels a little like Christmas for me. The crinkle of plastic, that crisp electronics smell, the unscratched metal--it's a data center manager holiday. So, last December, when I started the installation of thirty-two blade servers in our new facility, it was Christmas morning all over again.
We had arranged for an HP engineer to perform the installation, so I came a few hours early to move the more than one thousand pounds of gear from the storage room to the cage. There was only so much space inside the cage, so I had to break the giant boxes down and carry the gear load by load with a push cart.
Things were going well until I came across "the box"--one of those HP boxes that was so large you could poke a few holes in it and ship the engineer inside along with the gear. I took the lid off expecting a large blade chassis or maybe some blade servers, but instead I saw hundreds of tiny boxes in all shapes and sizes. At first I just laughed at this ridiculous contrast of box sizes, but once I identified that these tiny boxes were actually parts that belonged inside our blade servers, I resigned myself to the long day that was now ahead of me. Trip by trip, I filled up the push cart with teetering pillars of tiny boxes until the inside of the cage was full of stacks of tiny cardboard bricks.
Normally we take advantage of our vendor's integration service. Whenever we purchase a server with custom options, they install the extra CPU, RAM, etc. for us so the server we unpack is ready to rack. This time, though, through some sort of misunderstanding, the integration wasn't ordered for this shipment. What this meant for me was that every additional CPU, 16GB RAM upgrade, Fibre Channel HBA, extra NIC, battery-backed write cache, and hard drive that we had ordered for each blade had to be installed manually.
When the engineer arrived, we agreed the best thing was just to get started, so we unboxed and racked the chassis and installed the fans and power supplies--everything practically slid into place. The installation only slowed down when we got to the first blade. Since every option had to be installed manually, we worked out a sort of assembly line system to speed things up. The engineer would get a blade and open it up, and I would hand him each component to install one by one. While he finished a particular blade and slotted it into the chassis, I gathered up the growing pile of cardboard and other garbage and wheeled it out to the trash area. We repeated the same process over and over until the final tally for the day was twenty out of thirty-two blades integrated and racked.
Luckily the engineer was able to shuffle appointments and come back later in the week, coincidentally the same day our company was throwing its annual after-work Christmas party. We picked up where we left off, and by mid-morning we had finally finished installing all of the blades' options. It felt like the install should be over, but it had really only begun: there were still sixty-four hard drives to install and integrated network pass-throughs and Fibre Channel switches to add. Plus, we still had to upgrade all of the firmware. One by one, we knocked tasks off the list until we finally prepared to power on all of the blades.
Any excitement I would normally have at the sight of new gear powering on for the first time was replaced with fear that we would hit some other snag. The first sixteen blades powered on and to my relief the only casualty was a single hard drive. I actually got a bit optimistic and started to pack up my things while the second set of blades was powered on. I was barely going to make the Christmas party on time. It was at this moment that I saw the dreaded Red LED of Doom on not one, but three blades.
Humbug. The engineer and I did the only thing we could do and began troubleshooting each blade. It turned out that two of the blades simply needed to be powered on again--we had probably tried to power on too many blades at once--but the third blade was still red and dead. We had to identify the component to replace, so we took the blade apart yet again, removed all of the options we had added only hours before, and added parts one at a time until we isolated a single bad DIMM slot. The engineer and I were finally done.
As I sat in rush hour traffic I had plenty of time to think over my ordeal. In retrospect, the install would have gone smoothly if we had simply been able to rack everything straight from the box. Plus, any hardware problems would have been identified beforehand when the hardware was integrated. The obvious lesson: avoid "assembly required" toys, er, servers, especially for large installs. You'll save time, frustration, and in my case, a little Christmas spirit.
ABOUT THE AUTHOR: Kyle Rankin is a systems administrator in the San Francisco Bay Area and the author of a number of books including Knoppix Hacks and Ubuntu Hacks for O'Reilly Media.
This was first published in February 2007.