And the remote management tools are evolving. Many servers now ship with an integrated feature generically called lights-out management (LOM). Usually, a server series will give it a proprietary name. For example, Dell's PowerEdge servers pair the Dell Remote Access Controller (DRAC) with the OpenManage suite, and Hewlett-Packard Co. touts the Integrated Lights-Out (iLO) feature set on its ProLiant servers. While these embedded pieces of software and hardware may be diverse and feature-rich, it's imperative to understand the pitfalls that come along with these technologies.
Where remote server management fails
First, the obvious: Lights-out management tools will only work if there is power on your server. Of course, you have redundant power supplies on your mission-critical machines, but if both of those power units are dead, you're out of luck. At that point, it's time to call whoever is responsible for the server at the site to check the problem. Short of a complete power failure, your LOM tools can help you in a bind.
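The reasoning above can be written down as a small decision rule. This is only a sketch: the `NextStep` names and the two boolean inputs are assumptions made for illustration, and how you determine reachability (ping, a TCP check, a monitoring system) is left open.

```python
from enum import Enum


class NextStep(Enum):
    """Hypothetical outcomes for a first-pass remote triage."""
    NONE = "server is answering; nothing to do"
    USE_LOM = "OS is down but the LOM controller answers; work through LOM"
    CALL_SITE = "neither answers; ask someone on site to check the server"


def next_step(os_reachable: bool, lom_reachable: bool) -> NextStep:
    """Pick a next step from two reachability checks.

    Assumption: if neither the operating system nor the LOM controller
    answers, remote tools cannot help, whether the cause is a dead power
    supply or something else, so a person at the site has to look.
    """
    if os_reachable:
        return NextStep.NONE
    if lom_reachable:
        return NextStep.USE_LOM
    return NextStep.CALL_SITE
```

For example, `next_step(os_reachable=False, lom_reachable=True)` points you at the LOM console, while two failed checks mean it is time to pick up the phone.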
As a remote administrator, what do you do to prepare for some serious off-site work? Proper disaster recovery management and planning will help you deal with and resolve small issues that relate to your remote servers. There are three core problems (aside from the obvious complete power failure scenario) where remote management is concerned.
First, you're always relying on the Internet. If the Internet connection goes down, remote management goes down with it. Some administrators fall back on dial-up in an emergency, but if you've ever had to administer a server over dial-up, you know it's no picnic. Many administrators who run remote locations and cannot be without Internet service go the redundant route. If you choose to bring a secondary Internet line into your facility, make sure it comes from a different Internet service provider over a different physical line. Two different ISPs may still share the same pipe, so verify that the lines are completely separate and run through completely separate carriers. One way to accomplish this would be to have one carrier provide a T1/T3 line and the other a business-class cable connection. Going even further, some uptime-focused administrators add more redundancy still, which may include a cellular link, satellite and, as a last resort, dial-up.
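A minimal sketch of how an ordered list of redundant uplinks might be checked, preferred line first and last resort last. The `Uplink` type, the link names, and the probe target are hypothetical. The sketch also assumes the host's routing sends traffic from each local address out over the matching line; binding a source address alone does not guarantee that.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Uplink:
    name: str      # label only, e.g. "t1" or "cable" (hypothetical)
    local_ip: str  # address of the local interface attached to this line


def tcp_probe(link: Uplink, target=("192.0.2.10", 443), timeout=3.0) -> bool:
    """Return True if a TCP connection to target succeeds from this uplink.

    The target is a placeholder documentation address; substitute a host
    you control at the remote site.
    """
    try:
        with socket.create_connection(
            target, timeout=timeout, source_address=(link.local_ip, 0)
        ):
            return True
    except OSError:
        return False


def first_reachable(
    links: Iterable[Uplink],
    probe: Callable[[Uplink], bool] = tcp_probe,
) -> Optional[Uplink]:
    """Walk the uplinks in priority order and return the first that works."""
    for link in links:
        if probe(link):
            return link
    return None  # every path is down; remote management is unavailable
```

Passing the probe in as a parameter keeps the ordering logic testable without real network access, and lets you swap in whatever reachability check suits your environment.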
Next, you can only manage what you can see, so employ a tool whose features and interface are best suited to your requirements. Administrators can generally choose between lights-out management and IP-KVM (sometimes called iKVM). LOM incorporates a hardware chip running proprietary server software that allows you to administer a server. A standalone IP-KVM, on the other hand, is a separate product that can administer numerous servers in a rack or server environment. Both have their pros and cons, and both can administer servers remotely. For example, Avocent makes an IP-KVM that allows the administrator to plug a mouse, keyboard, monitor and even a serial port into the machine. This benefits the admin, who can now resolve issues with Cisco firewalls, network switches and other devices that don't have any LOM tools built in. You can even administer a PBX over a serial connection on an IP-KVM device.
David Langlands, the chief security architect at Cyberklix Inc., an international security and IT consulting firm, administers and oversees numerous servers located in various parts of the world. "When you have a remote environment, a colo for example, I feel that an IP-KVM is not an option, but a requirement," he says. However, standalone IP-KVMs have downsides as well, as Langlands points out. "They are often expensive and there is little to no visibility into the physical parts of a server. Sadly, you can't power cycle with a simple IP-KVM."
But whether you use a standalone IP-KVM or LOM, you will probably run into some problems trying to fully diagnose a physical problem. These are great tools for remote management and quick fixes, but if an internal fan is malfunctioning or something is wrong with another component within a server, the remote management software may not always show it. Since LOM is not all-encompassing, you must have a secondary option ready to troubleshoot severe hardware failures.
This means that when an issue occurs that remote management tools won't see or fix, you'll need trained eyes on the ground ready to resolve it. Some organizations may post an administrator at the remote location, rely on the IT staff of a remote hosting facility or engage the services of a contract engineer to provide on-call or as-needed service. Normally, a hosted colocation will have a service-level agreement stating that if a server has a hardware failure, on-site staff can replace that piece of hardware almost immediately. Since these locations operate 24/7/365, you won't run into a problem waiting for your servers to be repaired. On the downside, hosted colocation contracts and professional services get pricey, so be cautious. The price isn't much better for contract engineering services. For example, if you're in Chicago and have servers at a remote colocation in New York, you will be relying on external network engineers to help resolve the issue. These contracts can add up, and having a consultant look at your server will cost around $150 an hour, if not more.
Consider limitations in the remote management tools
Remember, remote management tools are not end-all tools that will save you from complete disaster. If anything, they are useful supplementary utilities that you can use in an emergency. No program or piece of hardware is flawless. So, when working with remote management tools such as LOM hardware and software, there are a few things to keep in the back of your head.
- Lights-out management usually comes as embedded software on a hardware chip within the server; you may therefore find the GUI to be lacking in some features.
- Due to limited resources, this onboard software may require very specific versions of Java or ActiveX, or even a specific browser, to work.
- If your server requires a high display resolution, you may run into some problems accessing the server, as some remote management tools don't fully support high resolutions.
- Be cautious about using "function" key combinations (Function+F2, Function+Ctrl+F12) on your keyboard. When using LOM, you run the very real risk that the software will not understand what you're trying to type.
Planning and preparation can forestall remote management limitations
So when does hands-off really mean hands-on? Usually, the answer will be in the planning that you do for your network infrastructure. As an IT administrator, you must find a happy medium between managing your servers remotely and the amount of hands-on time they receive. When doing a disaster analysis for your company, you must understand just how important your business-critical equipment really is. Ask yourself this question: "How critical is this piece of equipment and how long can I be without it?" For critical equipment, having someone experienced on the other end ready to fix the server is a must. An IP-KVM or high-end lights-out management utility will only go so far. Business continuity plays a big role in understanding when "hands-off" will sometimes mean "hands-on."
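One way to make that question concrete is to compare, for each piece of equipment, how long the business can tolerate an outage against how long it takes for someone to physically reach the machine. The sketch below is illustrative only: the `Asset` fields and any figures you feed it are assumptions, not recommendations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Asset:
    name: str
    max_outage_hours: float       # how long the business can be without it
    onsite_response_hours: float  # how long until someone can put hands on it


def coverage_gap_hours(asset: Asset) -> float:
    """Hours by which on-site response exceeds the tolerable outage (0 if none)."""
    return max(0.0, asset.onsite_response_hours - asset.max_outage_hours)


def needs_better_onsite_coverage(assets: List[Asset]) -> List[Asset]:
    """Return the assets whose hands-on response is slower than the business can tolerate."""
    return [a for a in assets if coverage_gap_hours(a) > 0]
```

Any asset this returns is a candidate for a stronger hands-on arrangement, such as a colocation service-level agreement or an on-call contract engineer, because remote tools alone would leave it down longer than the business can accept.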
ABOUT THE AUTHOR: Bill Kleyman, MBA, MISM, is an avid technologist with experience in network infrastructure management. His engineering work includes large virtualization deployments as well as business network design and implementation. Currently, he is the Director of Technology at World Wide Fittings Inc., a global manufacturing firm with locations in China, Europe and the United States.
This was first published in April 2010