Architectures that span distributed data centers can reduce the risk of outages, but enterprises still must take necessary steps to ensure IT resiliency.
Major data center outages continue to affect organizations and users worldwide, most recently and prominently at Verizon, Amazon Web Services, Delta and United Airlines. Whether the technical breakdown hits an airline or a cloud provider, both its bottom line and its reputation can take the damage.
Upon close examination, most data center outage causes are complicated, said Andy Lawrence, VP of research at 451 Research, in a recent webinar on data center resiliency.
Human error remains high on the list of triggers, and while automation reduces that risk, it won't completely eliminate it. IT staff should evaluate distributed data centers to achieve multisite resiliency, maintain uptime and minimize risk.
Resiliency matters when it comes to data center protection
There are several options to achieve resiliency across distributed data centers, whether for an organization's own facilities or those of a colocation or cloud provider.
There are three main types of multisite data center resiliency, according to the Uptime Institute:
- Linked-site resiliency: Two or more lower-tier data centers within a particular campus or region tightly connect to yield a higher degree of resiliency compared to a traditional, single-site setup.
- Distributed-site resiliency: Multiple, independent data centers use shared networks to deliver resiliency through multiple instances that are asynchronously connected.
- Cloud-based resiliency: Tightly linked facilities use high levels of bandwidth, fiber and two-phase commits across several data centers. This model usually involves virtualized applications, cloud instances and containers.
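To make the two-phase commit mentioned in the cloud-based model concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not how any particular provider implements it: the Site class and the two_phase_commit function are hypothetical names, and the sketch leaves out the timeouts, logging and coordinator recovery a real deployment would need.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical data center taking part in a commit (illustrative only)."""
    name: str
    healthy: bool = True
    staged: dict = field(default_factory=dict)
    committed: dict = field(default_factory=dict)

    def prepare(self, key, value):
        # Phase 1: stage the write and vote yes only if the site is healthy.
        if not self.healthy:
            return False
        self.staged[key] = value
        return True

    def commit(self, key):
        # Phase 2 (all voted yes): make the staged write visible.
        self.committed[key] = self.staged.pop(key)

    def abort(self, key):
        # Phase 2 (any vote was no): discard whatever was staged.
        self.staged.pop(key, None)


def two_phase_commit(sites, key, value):
    """Return True if every site committed, False if the write was aborted."""
    votes = [site.prepare(key, value) for site in sites]
    if all(votes):
        for site in sites:
            site.commit(key)
        return True
    for site in sites:
        site.abort(key)
    return False


if __name__ == "__main__":
    sites = [Site("east"), Site("west"), Site("central")]
    print(two_phase_commit(sites, "order-42", "paid"))
    sites[1].healthy = False  # simulate an outage at one site
    print(two_phase_commit(sites, "order-43", "paid"))
```

The point of the design is that a write becomes visible either at every site or at none, which is why this model depends on the tight links and high bandwidth described above: every site has to answer before any site can commit.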
What exactly is data center resiliency?
Resiliency is the ability of a server, network, storage system or an entire data center to recover quickly and continue operating even when there has been an equipment failure, power outage or other disruption.
Multisite resiliency challenges
While these distributed architectures boost resiliency compared to traditional, single-site models, there are some potential tradeoffs.
For example, admins must architect a network that is self-healing and responsive, with well-orchestrated failover and sufficient bandwidth, to maximize resiliency.
In addition, admins must ensure data integrity, keeping data synchronized and up to date as it replicates across sites, said Todd Traver, VP of IT optimization and strategy at the Uptime Institute, during the webinar. The ability to quickly move large amounts of data without packet loss is especially important to ensure database and overall data integrity in these distributed architectures.
"Now, you have different portions of the applications spread across the enterprise, colocation and cloud," he said. "How do they all interact together? How do they fail gracefully? How do you maintain the data integrity?"
Organizations should create an "assurance construct" to address those questions and show CIOs and CTOs how data traverses the network and how failover works, Traver said. That way, the entire business understands the level of resiliency that its infrastructure delivers.
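One small piece of such a check, confirming that replicated data matches across sites, might be sketched as follows. This is an assumption-laden illustration rather than anything Traver described: the digest and out_of_sync functions are hypothetical names, each site's data is modeled as a plain Python dictionary with string keys, and a real system would compare digests of database snapshots or storage blocks instead.

```python
import hashlib


def digest(records):
    """Hash a site's records in a stable key order so equal data gives equal digests."""
    h = hashlib.sha256()
    for key in sorted(records):
        h.update(repr((key, records[key])).encode("utf-8"))
    return h.hexdigest()


def out_of_sync(primary, replicas):
    """Return the names of replica sites whose data differs from the primary."""
    expected = digest(primary)
    return [name for name, records in replicas.items() if digest(records) != expected]


if __name__ == "__main__":
    primary = {"order-1": "paid", "order-2": "shipped"}
    replicas = {
        "colo": {"order-2": "shipped", "order-1": "paid"},  # same data, different order
        "cloud": {"order-1": "paid"},                       # missing a record
    }
    print(out_of_sync(primary, replicas))
```

Comparing digests rather than full data sets keeps the cross-site traffic small, which matters when the sites are enterprise, colocation and cloud facilities connected over shared networks.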
At this point, only major public cloud players, such as Google Cloud Platform, have the resources to establish true cloud-based resiliency with complete consistency across all data centers in the network, Lawrence said.
"It's probably not something that enterprises will be able to aspire to -- perhaps not at least for the next decade and a half -- but perhaps when they have enough sites and different colos, it might be possible," he said.
Ultimately, the kind of resiliency an organization pursues should depend on its applications.
"You need to understand the requirements [and] … the economics of your applications before you start thinking about which architectures you're going to adopt," Lawrence said. "And our view is you're probably, in most cases, going to end up adopting multiple of these. It's just a question about how you'll go about that."