What are some common mistakes made when using IPMI? The biggest mistake one can make is to not use it [because]...
IPMI is a pre-integrated management feature in servers. In fact, the Aberdeen Group -- an industry research firm -- stated in an August 2003 white paper that IPMI has become a checklist requirement for IT [when] evaluating infrastructure needs.
So, what's the end result of IPMI's work in this situation? In summary, IPMI helps minimize overall downtime and costs during the server life cycle by speeding the initial provisioning process, eliminating the need to physically visit the server to diagnose and rectify problems, providing predictive failure notification before system outages, and enabling more intelligent diagnostics before personnel are dispatched.
In this case, what would have happened without IPMI? Without an IPMI-enabled system, the administrator would have to visit the remote location to reboot the hung server, resulting in extra time and costs. With IPMI, the server can be rebooted remotely with a simple, standardized IPMI reset command. Additionally, the IPMI event log makes failing components easier to identify. When IPMI shows whether a problem stems from software or hardware, it arms administrators with more knowledge to improve [their] diagnosis before having to dispatch service personnel.
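As a rough illustration of the remote reset and event-log steps described above, the sketch below builds command lines for ipmitool, a widely used open-source IPMI client. The text does not name ipmitool, so its use here is an assumption, as is its being installed on the administrator's machine. The host address, user name, and the helper names ipmitool_argv and run are hypothetical. The "-I lan" option selects ipmitool's IPMI 1.5 LAN interface, and "-E" tells it to read the password from the IPMI_PASSWORD environment variable. By default the sketch only prints the commands and contacts no BMC.

```python
import subprocess


def ipmitool_argv(host, user, *command, interface="lan"):
    """Build an ipmitool command line aimed at a remote BMC.

    -I lan : IPMI 1.5 over LAN
    -E     : read the password from the IPMI_PASSWORD environment variable
    """
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E", *command]


def run(argv, dry_run=True):
    """Return the command as text (dry run) or execute it and return its output."""
    if dry_run:
        return " ".join(argv)
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical BMC address (documentation range) and account name.
HOST = "192.0.2.10"
USER = "admin"

# Step 1: read the System Event Log stored on the BMC.
view_log = ipmitool_argv(HOST, USER, "sel", "list")
# Step 2: if the log shows no hardware fault, issue a chassis reset.
reset = ipmitool_argv(HOST, USER, "chassis", "power", "reset")

print(run(view_log))
print(run(reset))
```

Passing dry_run=False to run would execute the command against a real BMC, so the reset line in particular should only be tried on a machine that can safely be rebooted.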
Using additional features like console redirection and virtual media, IT can interact with and view the boot process [or] use files from a remote console to run diagnostics or flash firmware. IPMI 1.5 offers multi-layered passwords and MD2 or MD5 signatures for additional security to ensure only known administrators have access.
What happens if there is a failure?
Let's assume that a remote server OS hangs. Without IPMI, the administrator cannot reconnect to the server agents or console from a remote location. With IPMI, the administrator can check the server's operational status -- temperature, fans, voltage, drive errors and other components -- by retrieving the server event log stored on the BMC. In this case, the administrator finds no events indicating a specific hardware problem that would have led to a hung OS, so a simple reboot appears to be the remedy.
What happens when the server is in production?
As an example, consider the process of taking 'bare metal' servers into production. Many of the management techniques and software tools that do this require some knowledge of the servers' hardware in order to provision them with the correct OS, drivers, applications and network settings. Typically, PXE boots and multiple OS image downloads are required, and sometimes not all the information can be captured. With IPMI, the server can describe itself -- meaning that, with IPMI utilities, a more immediate and accurate server hardware asset template can be developed. And, with a better idea of the target server's configuration, more intelligent decisions can be made. Overall, using IPMI can speed the process of taking servers into production.
How is IPMI usually deployed today?
IPMI is implemented as firmware running on a dedicated controller chip -- also known as a baseboard management controller, or BMC. Typically, the BMC is on the system motherboard, or blade. A BMC is used with IPMI firmware to create the basis for a standalone platform management subsystem -- independent of processors, the BIOS and the OS. This is the key concept to grasp, as it allows for manageability, monitoring and recovery during states where the system processor, BIOS or even the OS is not available.
These characteristics remove limitations encountered with all OS-dependent management agents, such as when the OS is hung. However, IPMI is not a replacement for existing management technologies; it's a complementary technology that improves system health. Finally, because IPMI manageability is implemented as a separate subsystem within a platform, management software is abstracted from platform hardware management, meaning the same software can manage other IPMI systems, so long as they have IPMI-compliant firmware.