Thomas Roberts lived through a nightmare caused by his data center's emergency power-off (EPO) switch, and it happened on Easter Sunday.
About a month after opening a new facility in March 2003, Roberts, the director of data center services for Novi, Mich.-based Trinity Health, got a call. It was Easter morning, and a contractor had accidentally activated the EPO switch as he tried to replace a module connecting the button to the fire alarm system. According to Roberts, the fiasco "took the data center out."
Trinity Health is the fourth largest Catholic health system in the country, with more than 45,000 employees, almost 400 outpatient clinics and facilities, and $6 billion in annual revenue. So when a data center goes down, it's a big deal.
Fortunately, the health care center didn't have any major clinical systems online in the 12,000-square-foot data center because it was so new. Also, since it's a Catholic organization, Trinity had tried to discharge as many patients as possible so they could spend the holiday with family and friends rather than in the hospital.
"We went out at 8:30 that morning," he said. "By 11:30 that night, we were probably 95% up and going, so we were pretty lucky. But from that day forward, I tried to lessen the effect of this EPO."
The EPO is one of those unspoken standards in the data center: that big, red pushbutton that usually sits by the exit. It is intended to let firefighters shut down power quickly during a data center fire so they avoid electrical shock from the IT equipment. Data center managers can also use the button to quickly shut down power from a distance or in the event that someone is being electrocuted. The EPO is connected to the uninterruptible power supply (UPS) circuits that power the data center; when it's hit, it's like unplugging the entire data center from the wall.
The hope of data center managers is that they'll never need to use an EPO button. The reality is that there's a good chance it will be set off accidentally or as an act of sabotage.
"There's a very hard-core group of public safety officials who are adamant that safety comes before all else, especially when their personnel are involved, which is very understandable," said Richard Sawyer, a principal for Albany, N.Y.-based EYP Mission Critical Facilities, a consulting engineering firm that specializes in designing critical facilities such as data centers. "However, in the data center realm they're rather myopic about it. The EPO can actually put more people at risk."
With Trinity, Sawyer would argue that having its data center shut down unnecessarily could put thousands of patients at risk, which is more dangerous than not being able to shut down a data center immediately in the event of a fire.
Accidents with and sabotage of the EPO are common. Sawyer has encountered some EPO designs that are practically begging to be set off accidentally. In one case, the EPO sat on a wall behind a copy machine; when the copier top was opened, it could easily hit the button. In another case, a service technician bumped against the EPO while unpacking equipment. And earlier this year, there was the disgruntled technician who nearly disrupted the power grid in the western United States when he hit the EPO.
"I have yet to meet a data center manager who advocates for an EPO or wants an EPO," Sawyer said.
Is NFPA 75 mandatory?
The emergency power-off button has its roots in a 1959 fire at the U.S. Air Force's statistical division at the Pentagon, which caused $6.7 million in damages, destroying three IBM mainframes and between 5,000 and 7,000 reels of magnetic tape.
Three years later, the National Fire Protection Association drafted its first Standard for the Protection of Electronic Computer Systems, which has come to be known as NFPA 75 and which introduced the EPO switch. But as Sawyer is eager to point out, NFPA 75 is just a recommended standard. NFPA 70, a mandatory code developed by the NFPA, was formed six years later – and mandates the EPO button only in certain situations.
Nonetheless, many state and local governments have adopted NFPA 75 and tend to impose it on data centers in their jurisdiction. As a result, most data center managers have come to think of NFPA 75 as nonnegotiable.
But Sawyer disagrees. The bottom line, he said, is that NFPA 70 should be followed, but not necessarily NFPA 75; accordingly, EPO buttons are not a requirement for all data centers.
Sawyer says that a data center must meet three criteria to be exempt from an EPO button requirement: no cables run underneath a raised floor, the IT equipment cable boxes are secured to the floor, and the facility does not disclose that it is following NFPA 75. He added that even if you do install an EPO, it doesn't have to be a big, red pushbutton. Sawyer suggests installing a rotary switch instead of a button because it's harder to set off accidentally. He also said there should be a plastic cover over the EPO that, when lifted, causes bright lights and loud horns to go off as a warning.
Meanwhile, Sawyer and others are advocating adjustments to the mandatory NFPA 70 code. One thing they'd like implemented is "zone EPOs" so you can have separate EPOs for different parts of the data center facility.
Dealing with the AHJ
Even if you've decided you don't need an EPO, dealing with the authority having jurisdiction (AHJ) – typically a fire marshal or inspector – is a whole other story. Just because you're right doesn't mean you'll get what you want.
"If you designate a room as a computer space, you can't get the certificate of occupancy until the AHJ has approved the design and installation and has tested it," Sawyer said. "Someone who has built a multimillion-dollar data center and can't get it approved can't get it online in time."
The best plan of attack is therefore compromise, not confrontation. It's the way that Roberts approached his local fire marshal. After researching the subject thoroughly, he talked to the fire marshal, who didn't grant Roberts permission to get rid of his EPO altogether but authorized the elimination of the fire system activation of the EPO. That way, if the fire system detects a fire and the halon and sprinkler systems activate, the EPO still won't go off. But if firefighters arrive on the scene and want to hit the button, they can.
Roberts isn't totally satisfied, though. He doesn't want the EPO to be activated accidentally. To that end, he's asked the fire marshal whether Trinity can put a time delay on the EPO so that, if the button is pushed, the data center manager has time to examine the premises and determine whether it was pressed accidentally. Roberts would like the delay to be at least three minutes, but it's something he's still negotiating with authorities.
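For readers who think in control logic, the arrangement Roberts describes can be sketched as a small state machine. This is only an illustration under stated assumptions, not Trinity's actual system or anything drawn from NFPA 70 or NFPA 75: the class name DelayedEpo, its fields, the 180-second default (the three minutes Roberts is asking for) and the idea of an operator "cancel" action are all hypothetical. Time is passed in explicitly as seconds so the behavior can be followed without waiting on a real clock.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EpoState(Enum):
    ARMED = "armed"      # normal operation, power is on
    PENDING = "pending"  # button pressed, countdown running
    TRIPPED = "tripped"  # power has been cut


@dataclass
class DelayedEpo:
    """Hypothetical model of an EPO with a manual-press delay."""
    delay_seconds: float = 180.0         # assumption: the three-minute delay Roberts wants
    fire_system_trips_epo: bool = False  # assumption: fire alarm no longer triggers the EPO
    state: EpoState = EpoState.ARMED
    pressed_at: Optional[float] = None

    def press_button(self, now: float) -> None:
        """Someone hits the button: start the countdown instead of cutting power."""
        if self.state is EpoState.ARMED:
            self.state = EpoState.PENDING
            self.pressed_at = now

    def fire_alarm(self, now: float) -> None:
        """Fire system activates. Ignored unless the tie-in is enabled."""
        if self.fire_system_trips_epo:
            # Assumption: a tied-in alarm starts the same countdown as a press.
            self.press_button(now)

    def tick(self, now: float) -> EpoState:
        """Advance the model to time `now`; trip if the delay has elapsed."""
        if self.state is EpoState.PENDING and now - self.pressed_at >= self.delay_seconds:
            self.state = EpoState.TRIPPED
        return self.state

    def cancel(self, now: float) -> bool:
        """Operator judges the press accidental. Works only before the trip."""
        self.tick(now)
        if self.state is EpoState.PENDING:
            self.state = EpoState.ARMED
            self.pressed_at = None
            return True
        return False
```

In this sketch a fire alarm alone leaves the power on, an accidental press can be cancelled while the countdown is running, and a press that nobody cancels cuts power once the delay expires. Whether any authority would accept such a delay is exactly what Roberts is still negotiating.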
The idea of a delay is something that Lance Harry, the business development manager at Fenwal Protection Systems, isn't fond of. Fenwal designs and installs fire protection systems, and Harry said "the most technically recommended scenario would be to tie in the EPO of the entire facility to the fire suppression control facility, so when the fire suppression agent is about to come out, the data center would shut down."
Harry added that there are already automatic delays inherent in the fire suppression system. The detection system picks up the event, sends a signal to the control panel, and then an alarm is sounded. Harry said the delay between the first alarm and the EPO going off could be as long as 10 minutes depending on what's going on. If there's a major fire, the delay could be much shorter.
"If you had a large event and there's a [manual] delay, you might have a problem," he said. "It's good for the system to interpret it appropriately and give you the automatic delay depending on the level of urgency."
EPO-first data center design
When it comes to the EPO, most data center experts agree on one thing: Plans for inclusion of an EPO need to be part of the initial design of a new data center. Thinking about how the EPO will fit into the facility after the fact is just asking for trouble.
"What generally happens is it doesn't become clear to the local contractor installing the system what the end user will do to tie it in to the EPO," Harry said. "The end user may or may not decide later what to do with the EPO situation. We would encourage them to have that strategy so we can design the appropriate system, whether the design would change based on the EPO strategy."
Sawyer agrees. Given the magnitude of potential impact of an EPO button on data center operations – and the ease with which it can be misused – proper planning is a must.
"Design the EPO system first in the data center, not last," he recommended.