It's nice to see companies paying more attention to disaster recovery (DR) and associated testing. This trend includes not only keeping DR plans current, but also executing recurring testing of those plans.
As part of your overall business continuity plans (BCP), your data center DR plans must address both the technical and non-technical aspects of a disaster event. Too often, in real recovery situations, the non-technical elements bring your recovery to a screeching halt. Here are a few examples worth learning from.
E-mail has quickly become a "mission critical" application for most of my clients. If it's down for even an hour, the HelpDesk gets inundated with calls. When a disaster hits, e-mail will be down for some period of time, depending upon your recovery point and recovery time objectives. But what is your "Plan B" if e-mail isn't available quickly? Problems could arise with access to corporate e-mail systems, address books, or historical e-mails. You may want to set up alternative e-mail access for your entire recovery team (e.g., Gmail, Yahoo, or Hotmail) in case corporate e-mail is not available quickly after a disaster event.
So, take away my e-mail, and I'm lost. But take away my cell phone, and now I'm really befuddled. Cell phone service outages are a real possibility, depending on the type of disaster. They could be caused by the sheer volume of calls placed or by problems at a cellular provider, and they could also affect the availability of voicemail. Your DR plan should therefore include alternatives for voice communication, potentially even an investment in satellite phones or other options for critical DR staff.
Other logistical issues are also important in the recovery equation. Access to your DR site is at the top of the list. Hopefully, you have some arrangement for staffing at your DR site upon the declaration of a disaster. Those staff members can at least immediately start some of the initial recovery steps. But your primary staff is likely critical to the success of your recovery plans, and this is where it can get tricky. Depending on the type of disaster, key staff may not be available for immediate recovery steps. Remote access to the DR site may also be affected, including access to the Internet, WAN links, or VPNs. Getting staff to the physical DR site could be problematic if roads are closed or planes aren't flying. Are these situations accommodated in your recovery plans?
Finally, think about all your partners, vendors, consultants, and other "outside resources" that could affect your recovery efforts. Support lines may be jammed, critical resources may be reallocated elsewhere, parts and inventory may not be readily available, and key contractors could decide these recovery activities are outside the scope of their contract.
Obviously, not every potential issue can be fully accommodated in your DR plan, but a look back at DR events over the past several years shows some real risks and probabilities that your plan should account for. Ensure that your BCP and DR plans take into consideration both the technical and non-technical aspects of a disaster event, and make sure your testing addresses those as well.
ABOUT THE AUTHOR: Bill Peldzus is Vice President of Data Center and Business Continuity & Disaster Recovery Services, GlassHouse Technologies.