The Intelligent Platform Management Interface (IPMI) is everywhere. In fact, this year will be the 10th anniversary of the 1.0 Specification. If you've purchased new systems in the last four years -- and we know you have -- there it is. But are you using IPMI? Do you know how? Don't worry, SearchDataCenter.com has you covered.
As with cars, system components have gained more and more sensor capability over the past 15 years. Long-time Linux users will remember the buzz around lm-sensors, which allowed you to put spiffy thermal sensor data on your desktop. In current-generation systems there is a truly remarkable wealth of information that goes largely untapped.
Previously, getting at that data was very complex because of a variety of sensor buses and interfaces. Two great improvements came together to resolve the complexity. First, system boards came to include a Baseboard Management Controller (BMC), which acts as a hub for the various sensors throughout a system. Second, IPMI became the standardized way to interact with the BMC.
Almost every modern server has a BMC built into the motherboard. It's a common misconception that you need a Service Processor (SP) to collect or report sensor data. SPs, such as Dell's Remote Access Card (DRAC) or Sun's Lights Out Manager (LOM), are add-on boards that extend the BMC's capabilities with features such as a Web interface, an SSH interface, SNMP, or networking, but they are by no means a requirement for using IPMI.
In fact, some systems, such as Dell's, allow the BMC to share a network interface with the system, allowing network IPMI capability without an SP.
The most common tool for interacting with IPMI is the open source IPMItool. While most users' experience with it has been limited to remotely power cycling hung servers, it's capable of much more. Let's review its basic use and then dig deeper.
The most basic use of IPMI is power management. If your OS offers a BMC driver (commonly /dev/bmc) you can interface with the BMC via IPMI locally, like so:
$ ipmitool -I bmc chassis power status
Chassis Power is on
When accessing it remotely, you'll need to use the "lan" (IPMI v1.5) or "lanplus" (IPMI v2.0) interface:
$ ipmitool -I lanplus -H 10.0.50.143 -U root chassis power status
Chassis Power is on
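If you want to script the power check above, one option is a thin wrapper around the same command line. The following is only a sketch under a few assumptions: the function names are made up for illustration, the local interface is called "bmc" as in the examples here (other platforms may name it differently), and ipmitool is left to obtain the password however you normally supply it.

```python
import subprocess


def ipmitool_argv(subcommand, host=None, user=None, interface=None):
    """Build the ipmitool argument list (hypothetical helper).

    With no host we assume the local "bmc" interface used in this article;
    with a host we default to "lanplus" (IPMI v2.0).
    """
    if interface is None:
        interface = "lanplus" if host else "bmc"
    argv = ["ipmitool", "-I", interface]
    if host:
        argv += ["-H", host]
    if user:
        argv += ["-U", user]
    return argv + subcommand.split()


def parse_power_status(output):
    """Map the 'Chassis Power is on/off' line to True/False."""
    text = output.strip().lower()
    if text == "chassis power is on":
        return True
    if text == "chassis power is off":
        return False
    raise ValueError(f"unexpected ipmitool output: {output!r}")


def power_is_on(host=None, user=None):
    """Run 'chassis power status'; assumes ipmitool is installed and on PATH."""
    result = subprocess.run(
        ipmitool_argv("chassis power status", host, user),
        capture_output=True, text=True, check=True,
    )
    return parse_power_status(result.stdout)
```

Calling power_is_on() with no arguments would query the local BMC, while power_is_on("10.0.50.143", "root") would build the same remote command shown above.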
Now that we can use IPMI, let's look at three powerful capabilities of IPMI that we can use IPMItool to access.
The first is sensor data from the IPMI sensor data repository (SDR). The current values and thresholds from every sensor on the system are contained within the SDR. This commonly includes voltage sensors, temperature sensors, fan speeds and fault sensors.
$ ipmitool -I bmc sdr
CPU 0 Temp | 38 degrees C | ok
CPU 1 Temp | 43 degrees C | ok
Ambient Temp0 | 19 degrees C | ok
Fanbd1/FM1 | 11704 RPM | ok
Fanbd1/FM0 | 11935 RPM | ok
Fanbd1/FM3 | 11473 RPM | ok
Power Supply 0 | 0x00 | ok
Power Supply 1 | Not Readable | ns
The three columns here are the name of the sensor, the current value, and the threshold indicator. Notice, for instance, that Power Supply 1 is "Not Readable." This is because I don't have redundant power for that server and removed the redundant power supply.
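Because the SDR listing is just three pipe-separated columns, it is easy to turn into structured data. Here is a minimal sketch that assumes the output looks exactly like the listing above; the SdrReading type and parse_sdr function are illustrative names, not part of IPMItool.

```python
from typing import NamedTuple


class SdrReading(NamedTuple):
    name: str    # sensor name, e.g. "CPU 0 Temp"
    value: str   # raw reading, e.g. "38 degrees C" or "Not Readable"
    status: str  # threshold indicator, e.g. "ok" or "ns"


def parse_sdr(text):
    """Split each 'name | value | status' line into an SdrReading."""
    readings = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        readings.append(SdrReading(*fields))
    return readings


# A few lines copied from the listing above.
SAMPLE = """\
CPU 0 Temp | 38 degrees C | ok
Fanbd1/FM1 | 11704 RPM | ok
Power Supply 1 | Not Readable | ns
"""

# Report every sensor whose indicator is anything other than "ok".
for reading in parse_sdr(SAMPLE):
    if reading.status != "ok":
        print(f"{reading.name}: {reading.value} ({reading.status})")
```

With the sample lines, only Power Supply 1 is reported, which matches the missing redundant supply described above.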
The second is event data logging from the IPMI System Event Log (SEL). The SEL is responsible for logging significant events; a lot of the system warning lights (such as the Dell Orange LCD of warning) are set based on this log. On many systems, you can turn off all those warning LEDs by simply clearing the SEL. Let's look at some data.
$ ipmitool -I bmc sel elist
1 | 12/21/2007 | 00:15:24 | Fan Fanbd1/FM4 | Upper Critical going high | Reading 14938 > Threshold 14938 RPM
2 | 12/21/2007 | 00:24:31 | Fan Fanbd1/FM4 | Upper Critical going low | Reading 14784 < Threshold 14938 RPM
43 | 12/21/2007 | 01:26:55 | Fan Fanbd0/FM4 | Upper Critical going low | Reading 14322 < Threshold 14938 RPM
44 | 12/21/2007 | 01:26:56 | Fan Fanbd0/FM7 | Upper Critical going low | Reading 14245 < Threshold 14938 RPM
45 | 12/21/2007 | 01:26:56 | Fan Fanbd0/FM6 | Upper Critical going low | Reading 14245 < Threshold 14938 RPM
46 | 12/21/2007 | 01:28:01 | Fan Fanbd1/FM4 | Upper Critical going low | Reading 14168 < Threshold 14938 RPM
In this case the events all relate to cooling fans crossing their upper critical threshold, first going above it and later dropping back below it. We can clear the log like so:
$ ipmitool -I bmc sel clear
Clearing SEL. Please allow a few seconds to erase.
If your server includes an "error log" via a web interface or SNMP, you are probably actually seeing the IPMI SEL.
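Before clearing the log, you may want to summarize it. The sketch below parses lines in the "sel elist" layout shown earlier and counts events per sensor. It assumes every record carries an MM/DD/YYYY date and an HH:MM:SS time as in that listing; a record with a different timestamp format would raise an error here. The parse_sel name is illustrative.

```python
from collections import Counter
from datetime import datetime


def parse_sel(text):
    """Parse 'id | date | time | sensor | event | detail' lines into dicts."""
    events = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or unexpected lines
        record_id, date, time, sensor, event = fields[:5]
        events.append({
            "id": record_id,
            # Assumes the date/time layout shown in the listing above.
            "when": datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S"),
            "sensor": sensor,
            "event": event,
            "detail": fields[5] if len(fields) > 5 else "",
        })
    return events


# Three records copied from the listing above.
SAMPLE = """\
1 | 12/21/2007 | 00:15:24 | Fan Fanbd1/FM4 | Upper Critical going high | Reading 14938 > Threshold 14938 RPM
2 | 12/21/2007 | 00:24:31 | Fan Fanbd1/FM4 | Upper Critical going low | Reading 14784 < Threshold 14938 RPM
43 | 12/21/2007 | 01:26:55 | Fan Fanbd0/FM4 | Upper Critical going low | Reading 14322 < Threshold 14938 RPM
"""

events = parse_sel(SAMPLE)
per_sensor = Counter(event["sensor"] for event in events)
for sensor, count in per_sensor.most_common():
    print(f"{sensor}: {count} event(s)")
```

A per-sensor count like this makes it easier to spot a single noisy fan before you wipe the evidence with "sel clear".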
The final useful feature of IPMI we want to look at is Serial over LAN (SoL). IPMI SoL allows you to do full serial console redirection via IPMI over a network, just as though you were using a console server. Example:
$ ipmitool -I lanplus -H 10.0.50.143 -U root sol activate
[SOL Session operational. Use ~? for help]
ev2-dev2950-01.joyent.us console login:
While I do not recommend replacing your existing console access method with IPMI, there are a variety of cases in which it can be extremely helpful, especially if you are on a tight budget. If you've used a DRAC's "connect com2" command, you were in fact using IPMI SoL.
We've just scratched the surface of what is possible and available via IPMI, but I hope you can see that we commonly take this wealth of information for granted.
Equally important, most management interfaces available from vendors are wrapping IPMI functionality. That means that when you choose to customize your monitoring solution, for instance to check individual fan speeds or disk status, you can probably avoid vendor tools and use IPMI directly.
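As an illustration of that kind of custom check, here is a small sketch that scans SDR-style output for fans running below a minimum speed. The slow_fans function and the 11500 RPM limit are hypothetical, chosen only to exercise the sample data; a real check would use limits appropriate to your hardware.

```python
import re

# Matches values such as "11704 RPM" in the second SDR column.
RPM_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*RPM$")


def slow_fans(sdr_text, min_rpm):
    """Return (sensor name, rpm) pairs for readings below min_rpm."""
    slow = []
    for line in sdr_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # not a 'name | value | status' line
        name, value, _status = fields
        match = RPM_RE.match(value)
        if match and float(match.group(1)) < min_rpm:
            slow.append((name, float(match.group(1))))
    return slow


# Fan and temperature lines copied from the SDR listing earlier in the article.
SAMPLE = """\
CPU 0 Temp | 38 degrees C | ok
Fanbd1/FM1 | 11704 RPM | ok
Fanbd1/FM0 | 11935 RPM | ok
Fanbd1/FM3 | 11473 RPM | ok
"""

# 11500 is an arbitrary illustrative limit, not a vendor threshold.
for name, rpm in slow_fans(SAMPLE, min_rpm=11500):
    print(f"WARNING: {name} at {rpm:.0f} RPM")
```

A function like this could be fed the output of "ipmitool sdr" from a cron job or a monitoring plugin, without any vendor agent installed.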
I hope this short tutorial has given you a little incentive and courage to dig in and find out just how much your servers really know about what you're doing. May your servers be happy and your sleep uninterrupted!
ABOUT THE AUTHOR: Ben Rockwood is a Unix guru and blogger at CuddleTech.