Disaster recovery should only be seen as the ultimate safety net. For every second a company's IT platform is down, it loses revenue. Business continuity should instead be the aim, since it enables an organization to work through a failure in any part of the IT platform, rather than just minimize downtime.
When it comes to building an IT business continuity plan, there are different approaches teams should take for colocation vs. cloud platforms.
The business continuity challenge with colocation
The problem with colocation is that the organization still owns all the hardware that is placed within the facility. As you would with an on-premises platform within your own data center, expect an "N+M" redundancy strategy: for every N pieces of equipment needed to run the workload, there should be M spare pieces so that the failure of any single item can be absorbed. However, this means that an organization pays for a lot of equipment that is there just in case. This involves capital costs along with licenses, maintenance, power and space. What's more, that only provides for low-level equipment failure, where business continuity is maintained by switching workloads over from one piece of equipment to another.
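To make the cost of N+M concrete, here is a minimal sketch in Python. The class name, method names and the figures of eight active units and two spares are illustrative assumptions, not taken from any real deployment. It shows how many failures a plan absorbs and what share of the purchased equipment sits idle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyPlan:
    """An N+M plan: N units carry the load, M spare units stand by."""
    n_active: int  # units needed to carry the workload
    m_spare: int   # extra units held purely for failover

    def survives(self, failed_units: int) -> bool:
        # The workload keeps running while at least N units remain healthy,
        # which means no more than M units may fail at the same time.
        return failed_units <= self.m_spare

    def overhead_ratio(self) -> float:
        # Share of all purchased units that exist only "just in case".
        return self.m_spare / (self.n_active + self.m_spare)


# Hypothetical figures: 8 units run the workload, 2 are held in reserve.
plan = RedundancyPlan(n_active=8, m_spare=2)
print(plan.survives(2))
print(plan.survives(3))
print(f"{plan.overhead_ratio():.0%} of the hardware is standby")
```

In this example, one-fifth of the hardware, along with its licenses, maintenance, power and space, is paid for without doing productive work, and the plan still says nothing about losing the whole site.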
For site-level business continuity, organizations have to pay for a mirrored site in a different facility, and face the challenge of maintaining long-distance synchronicity. As such, building a full IT business continuity plan for colocation can still be very expensive.
With cloud, pay attention to your SLA
Since you don't own the hardware in the cloud, it is the service-level agreement (SLA) that becomes king. Within that SLA, determine which levels of business continuity the cloud provider commits to.
Since a cloud platform is shared among multiple users, a well-architected cloud platform will already cover any single-item equipment failures -- it follows the N+M model to a large extent. However, there have been far too many failures in the cloud where the provider cut corners and implemented only an N strategy in certain areas, such as Fibre Channel controllers in storage area networks, or wide area network connections.
When you speak with providers and build an IT business continuity plan with cloud, make sure all aspects of the platform are covered by suitable redundancy within the facility itself -- including cooling, uninterruptible power supply and other auxiliary power supply systems.
Cold vs. warm images
Also speak with cloud providers to ensure you have a long-distance IT business continuity plan. A cost-effective approach is to hold images of the required applications that are ready to take over should the primary site go down. For high-priority workloads, use warm images that are already provisioned and running at a remote site, along with mirrored data stores. Use cold images -- those that need spinning up into a live, operational state -- for less important workloads.
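The warm-versus-cold rule can be written down as a simple policy. The sketch below is only an illustration: the priority labels ("high", "medium", "low") and the workload names are hypothetical, and a real plan would take its priorities from a business impact analysis.

```python
from enum import Enum


class ImageTier(Enum):
    WARM = "warm"  # provisioned and running at the remote site, data mirrored
    COLD = "cold"  # stored image that must be spun up before it can serve


def choose_tier(priority: str) -> ImageTier:
    """Map a workload priority label to a standby image tier.

    Assumption: only workloads labeled "high" justify the cost of a
    warm image; everything else falls back to a cold image.
    """
    return ImageTier.WARM if priority == "high" else ImageTier.COLD


# Hypothetical workloads and priorities.
workloads = {"payments": "high", "reporting": "medium", "intranet-wiki": "low"}
tiers = {name: choose_tier(priority) for name, priority in workloads.items()}

for name, tier in tiers.items():
    print(f"{name}: {tier.value}")
```

Keeping the rule this explicit makes it easy to review with the business which workloads really merit a warm image.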
Warm images are still comparatively expensive, since they are essentially a full mirror of what is going on at the primary site. However, due to the elasticity of cloud resources, they can be held with few excess resources in place, taking what is needed as the workload switches from the primary site to the backup site.
Cold images are cheap to operate. They incur only a low storage cost, and should be covered by an agreed-upon SLA that specifies how rapidly they can be spun up, with sufficient resources, to get things back up and working again.
Although this cold image approach is closer to a backup-and-restore strategy, modern cloud platforms can spin up images in little time, and as long as the data is being mirrored synchronously, little or no data is lost. This is far better than trying to recover everything from data backups, where the real problem is meeting both the recovery point objective (how much recent data can be lost) and the recovery time objective (how long the workload can stay down).
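One way to compare recovery approaches is to check each against the two objectives. The following is a rough sketch under stated assumptions: the option names, the targets and every duration are invented for illustration and would come from your own SLA and testing in practice.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    data_loss_window: timedelta  # age of the newest data that can be recovered
    time_to_restore: timedelta   # time until the workload is serving again


def meets_objectives(option: RecoveryOption,
                     rpo: timedelta, rto: timedelta) -> bool:
    # An option is acceptable only if it satisfies both objectives.
    return option.data_loss_window <= rpo and option.time_to_restore <= rto


# Hypothetical targets for one workload.
rpo = timedelta(minutes=5)
rto = timedelta(minutes=30)

# Hypothetical figures for two recovery approaches.
options = [
    RecoveryOption("cold image + synchronous mirror",
                   data_loss_window=timedelta(0),
                   time_to_restore=timedelta(minutes=15)),
    RecoveryOption("restore from nightly backup",
                   data_loss_window=timedelta(hours=24),
                   time_to_restore=timedelta(hours=6)),
]

for option in options:
    verdict = "meets" if meets_objectives(option, rpo, rto) else "misses"
    print(f"{option.name}: {verdict} the objectives")
```

With these assumed numbers, the cold image passes and the backup restore fails on both counts, which is the gap the spin-up SLA for cold images is meant to close.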
When building an IT business continuity plan, a cloud platform can offer a far more flexible and cost-effective approach than colocation. However, many colocation providers partner with cloud providers to offer a hybrid possibility.