In the months before Hurricane Katrina struck in 2005, the IT department for the Louisiana Supreme Court drafted a disaster recovery plan, including plans to create a hot site, to keep all systems running continuously in the event of a disaster. By August 2008, when Hurricane Gustav struck the region, that plan proved invaluable.
But in the meantime, Katrina hit, and the court shut down for two months, delaying the department's disaster recovery (DR) plans and putting justice on hold. "Katrina beat us to the punch," said Peter Haas, the court's technology director. "It essentially shut us down, and we went through some extraordinary things to get the court running again in Baton Rouge. It drove the point home that we really, really needed something."
When the court was up and running again, Haas and his team used their firsthand experience with Katrina to build a bullet-proof DR plan that saw them through Hurricane Gustav without a hitch. Among the Katrina-inspired improvements: a hot backup DR site in northern Louisiana, cross-trained staff, and thumb drives carrying crucial information that IT staff hang around their necks.

Katrina brings DR vulnerabilities to the fore
Haas and his crew also ferreted out all single points of failure. And they found several.
"If you do it honestly, it's a frightening picture," Haas said.
The court discovered several issues, including the following:
- A week's worth of fuel for backup generators wasn't enough. The new DR plan calls for 14 days of diesel and the ability to switch over to natural gas.
- Too much reliance on vendors with subpar backup processes. Some court employees have BlackBerrys, and during Katrina, the court's mobile phone carrier "fell flat on their face." The court changed providers, although Haas wouldn't name names.
- An overly specialized staff. "During Katrina we scattered," said Haas. "There were people all over the country. So we've had to cross-train everybody so they could speak intelligently about other disciplines."
- Overdependence on tape-based backups. Because IT employees might not be able to access tapes in an emergency, they now carry key IT architectural data on thumb drives worn around their necks. The court also built a hot site in northern Louisiana and implemented better high availability, backup and recovery processes.
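The kind of audit Haas describes can be approximated as a simple inventory check: list each dependency, count how many independent alternatives exist, and flag anything with fewer than two. The sketch below is purely illustrative. The `Component` class, the `single_points_of_failure` function and the inventory entries are hypothetical, loosely modeled on the issues listed above, and do not represent the court's actual tooling or data.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    independent_copies: int  # number of independent sources, sites or people


def single_points_of_failure(inventory):
    """Return the names of components with fewer than two independent copies."""
    return [c.name for c in inventory if c.independent_copies < 2]


# Hypothetical pre-Katrina inventory (illustrative values only)
inventory = [
    Component("generator fuel source", 1),   # diesel only
    Component("mobile phone carrier", 1),    # one provider
    Component("staff per discipline", 1),    # no cross-training
    Component("backup media location", 1),   # tapes at one site
    Component("website hosting site", 2),    # assumed redundant for the example
]

spofs = single_points_of_failure(inventory)
for name in spofs:
    print("Single point of failure:", name)
```

Anything the check flags is a candidate for a second source, a second site or a cross-trained backup person, which is the pattern the court's fixes followed.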
For its hot site, the court first explored commercial options such as SunGard and other hot-site services, but Haas said these providers' offerings involved some sticker shock. Instead, the court commissioned an existing state office in northern Louisiana and turned it into its disaster recovery site. That proved beneficial. For one, because it was a state building, there were no purchase costs. At the same time, it was far enough from the coast to be safe but close enough to New Orleans to be reached in four hours.
Ultimately the court set up a primary data center with about 40 servers – 31 of them in production – and a hot site with 14 servers backing up applications such as email and major database services. The court is also at work on backing up its BlackBerry application at its hot site.
Applications are backed up to the DR site using data replication software from CA, one of four products that the court evaluated. Haas said that while all the candidates were good, CA's XOsoft "was very intuitive and a complete package. Others were piecemeal. They didn't have a clean install and clean flow to them, and there was more than one component."
The court implemented its disaster recovery strategies over the course of six months, and finished in May 2006. The court now tests every year before hurricane season. Most recently, the court ran a drill four weeks before Hurricane Gustav.

Girding for Gustav
With memories of the Katrina disaster still fresh, officials prepared for Hurricane Gustav with care, ordering massive evacuations. And while the storm was not as calamitous as officials feared, it still caused billions of dollars' worth of damage, left more than a million people in the state without power and was blamed for about 43 deaths in Louisiana.
Haas said that as the storm approached, he wasn't that concerned about the systems' ability to stay up, because the team had done regular, extensive testing. But he was worried that network bandwidth could become a problem, with everyone trying to back up data to a remote location. So he decided to fail over email services to the court's hot site.
"I had the systems manager on the phone, and we were doing it from our laptops using Verizon cards," he said. "We were in the middle of evacuating when we did it."
The court's website is already hosted at the DR site, so IT didn't have to worry about failing that over.
The bottom line: IT kept communications online throughout the storm. Nothing went down. And when the court reopened four days later, all applications ran as smoothly as they had before the storm. Thanks to careful DR planning, the Louisiana Supreme Court beat Gustav to the punch.