
Why a disaster recovery plan is a core part of compliance

Compliance cares whether the organization can provide accountability and assurance that it is protecting the confidentiality, integrity, and availability of its information and processes. Hence the connection.

Why is having a disaster recovery plan such a core part of regulatory compliance? Isn't a backup data center enough?

Actually, the regulations don't so much call for a disaster recovery plan (DRP) as they call for a business "continuity" or "contingency" plan (BCP). That's a big difference, so let's start there.

The difference between a DRP and a BCP is when you've decided to protect the horse. A disaster recovery plan focuses on what to do after the barn has burned down. The business continuity or contingency plan focuses on what to do when you smell smoke.

Why is this a core part of compliance? Compliance cares whether the organization can provide accountability and assurance that it is protecting the confidentiality, integrity, and availability of its information and processes. All of these have to balance, as shown in the diagram below:


Assured security means a balance of confidentiality, integrity, availability, and accountability.

Because compliance is about availability and not recovery, it is about continuity of operations and having contingency plans to ensure continuous operations. Hence, the plans have to start when you smell smoke, and not after the fire truck has left the scene.

To react correctly when you smell smoke, you need a plan that begins with assessment and then moves through reaction options, direction determination, and several action plans that support divergences. Let's take this step by step from two different viewpoints. The first is a restaurant chain's data center that houses the chain's food ordering system and POS rollup system. The second is a bank's data center that also supports credit card clearing.

Both organizations would start their plan by assessing their inherent risks and the organization's tolerance for downtime. The bank, because it is a processing clearinghouse, can allow only a matter of minutes of downtime. The restaurant hub has to worry about each day's POS rollup as well as each day's consolidated orders from the individual chain managers. It has a low tolerance for downtime during ordering and POS rollup hours (7 pm to 11 pm each night), but outside those hours it could be down for several hours at a time. The risk assessment (which is recalibrated each time a new process or business system is added) drove each organization's choice of recovery site, as did the way Sungard contracts its sites: first come, first served, so if too many customers show up, someone has to go elsewhere. The bank had to invest in building out its own data center, while the restaurant had enough tolerance for downtime to use Sungard.
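The downtime tolerances described above can be written down as data, which makes them easier to recalibrate when a new process or system is added. The following is a minimal Python sketch, not a prescribed method. The class names and every duration are illustrative assumptions; only the 7 pm to 11 pm window comes from the example above.

```python
from dataclasses import dataclass
from datetime import time, timedelta
from typing import Tuple


@dataclass(frozen=True)
class CriticalWindow:
    """A daily period with a tighter downtime tolerance (hypothetical type)."""
    start: time
    end: time
    tolerance: timedelta


@dataclass(frozen=True)
class DowntimeProfile:
    """Downtime tolerance for one organization (hypothetical type)."""
    name: str
    default_tolerance: timedelta
    critical_windows: Tuple[CriticalWindow, ...] = ()

    def tolerance_at(self, moment: time) -> timedelta:
        """Return the allowable downtime for an outage starting at `moment`."""
        for window in self.critical_windows:
            # Window start is inclusive, window end is exclusive.
            if window.start <= moment < window.end:
                return window.tolerance
        return self.default_tolerance


# Assumed figures: the article says only "a matter of minutes" and "several hours".
bank = DowntimeProfile("bank clearinghouse", default_tolerance=timedelta(minutes=5))
restaurant = DowntimeProfile(
    "restaurant hub",
    default_tolerance=timedelta(hours=4),
    critical_windows=(
        # Ordering and POS rollup hours, 7 pm to 11 pm each night.
        CriticalWindow(time(19, 0), time(23, 0), timedelta(minutes=30)),
    ),
)
```

Calling `restaurant.tolerance_at(time(20, 0))` returns the tighter evening tolerance, while the same call at 2 pm returns the default. The bank's profile returns the same tight tolerance at any hour.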

Reaction options must be planned in two ways: what to do when changes to the business or systems occur (that is, how to react), and what to do when the organization senses danger. There isn't enough room in this article to wade through all of the facilities' reaction steps to threats, but the methodology is the same for both the bank and the restaurant. When a change in systems or organization occurs, find the root cause and recalibrate risks and decisions. When a change to the primary facility is likely to occur (possible loss of power, possible physical hazard, possible technological hazard), assess whether to switch immediately to the secondary site, servers, and processes, or to continue monitoring the situation.
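The switch-or-monitor assessment can be sketched as a simple rule. This Python example is illustrative only: the function name, the likelihood threshold, and the idea of comparing an expected outage against the downtime tolerance are assumptions made for the sketch. A real plan would weigh many more factors.

```python
from datetime import timedelta
from enum import Enum


class Action(Enum):
    SWITCH_TO_SECONDARY = "switch to secondary site"
    KEEP_MONITORING = "keep monitoring"


def react_to_threat(
    expected_outage: timedelta,
    tolerance: timedelta,
    likelihood: float,
    likelihood_threshold: float = 0.5,  # assumed cutoff, not from the article
) -> Action:
    """Hypothetical rule: switch when an outage is likely enough
    and would last longer than the organization can tolerate."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be between 0 and 1")
    if likelihood >= likelihood_threshold and expected_outage > tolerance:
        return Action.SWITCH_TO_SECONDARY
    return Action.KEEP_MONITORING
```

Under these assumptions, the same threat produces different answers for different organizations. A likely one-hour power loss would push a bank with a five-minute tolerance to switch at once, while a restaurant hub outside its evening window could keep monitoring.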

If the decision is to make the switch, the plan then has to cover direction determination (move all systems or move only some), action plans for the chosen direction, and divergence plans in case that direction is no longer feasible. Since the bank is building out its own data center, it can "go lite" on divergence because it has control over its own destiny. But because the restaurant is working with Sungard, it does have to plan for there being no space at the original Sungard site and having to go elsewhere.
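A divergence plan for a shared, first-come-first-served provider amounts to an ordered list of fallback sites. Here is a minimal Python sketch of that idea. The site names, rack counts, and the `choose_site` helper are all hypothetical and do not describe any real provider's inventory or API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RecoverySite:
    name: str
    free_racks: int  # capacity still unclaimed at declaration time (assumed unit)


def choose_site(sites: Sequence[RecoverySite], racks_needed: int) -> Optional[RecoverySite]:
    """Return the first site, in preference order, with enough free capacity.

    Returns None when every site is full, which is the point at which
    the plan needs yet another divergence option.
    """
    for site in sites:
        if site.free_racks >= racks_needed:
            return site
    return None


# Hypothetical preference order for the restaurant chain.
restaurant_sites = [
    RecoverySite("original shared site", free_racks=0),   # already claimed by others
    RecoverySite("alternate shared site", free_racks=6),
]

chosen = choose_site(restaurant_sites, racks_needed=4)
```

Here `chosen` is the alternate site, because the original has no room left. An organization with its own data center would have a one-entry list, which is why it can "go lite" on divergence.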

You also need to think of your plan as a set of procedures to make this all work. If you think of the plan as "a book," you'll end up putting the book on a shelf and never updating it. If you look at it as a set of individual procedures (as Liebert does with its data center availability assessment service), those procedures then become the ever-important control objectives that the compliance auditors are looking for.

For compliance, it is about maintaining confidentiality, integrity, availability, and accountability. Having a systematic and comprehensive set of procedures and processes that use those procedures ensures that you have control and accountability. The thoroughness of the procedures ensures that if you smell smoke, you can make the decision to move to a secondary site and maintain the same levels of confidentiality and integrity while maintaining availability using temporary facilities.
