
Learn efficiency lessons from hyperscale cloud providers

Enterprise data center teams can learn important lessons from hyperscale cloud providers. Rethink redundancy strategies and consider technologies like SDN to increase efficiency.


The technology that powers Lewis Hamilton's Mercedes engine to victory on the Formula One circuit eventually makes its way into the garages of drivers who won't ever compete in a grand prix. So it is in data centers, where gains in efficiency and automation by hyperscale cloud providers have begun to trickle down to traditional data centers.

Everything from the latest cooling technology to automated provisioning is up for grabs to boost efficiency and lower costs in data centers of ordinary companies.

"You are getting a drafting effect, just like you are in a race," said Chris Yetman, chief operating officer at Vantage Data Centers, and former vice president of infrastructure operations at Amazon Web Services (AWS). "The big guys are racing forward, and everyone is squeezing everything they can get from them."

Data centers at the back of the pack will be the ones run by IT leaders stuck in their old ways, with their eyes and ears closed to the lessons from hyperscale cloud providers. Many businesses today make crucial decisions with efficiency in mind, said Todd Traver, vice president of IT optimization and strategy at the Uptime Institute.

"The biggest benefits are coming from the organizations where the leadership has taken a strong position and has put mechanisms in place to track utilization and targets," he said.

Rethinking redundancy

Until about four years ago, most businesses relied on 2N infrastructure, a redundancy strategy in which data centers contained twice the number of each infrastructure component needed for baseline operations. If a company needed 10 servers for normal operations, for example, a 2N infrastructure would call for 20 servers. Now, there's wider acceptance of a blended architecture because there is more diversity in the applications and less dependency on the physical infrastructure.

More organizations are happy with N+1 redundancy -- an approach in which a company maintains just one spare of each infrastructure component beyond what's needed for normal operations.
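As a rough illustration of the difference, here is a minimal sketch of the sizing arithmetic described above. It uses the same 10-server example and makes no other assumptions.

    # Minimal sketch of 2N vs. N+1 sizing, using the 10-server example above.

    def components_2n(baseline: int) -> int:
        """2N: a full duplicate of every component needed for normal operations."""
        return baseline * 2

    def components_n_plus_x(baseline: int, spares: int = 1) -> int:
        """N+1 (or N+2, and so on): the baseline plus a small number of spares."""
        return baseline + spares

    baseline_servers = 10
    print(components_2n(baseline_servers))        # 20 servers under 2N
    print(components_n_plus_x(baseline_servers))  # 11 servers under N+1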

At Digital Realty Trust, a San Francisco-based colocation provider, customers have lessened their reliance on 2N redundancy architecture over the past few years, said Danny Lane, the company's senior vice president of global operations.

Virtualization technology and the application resiliency inherent in cloud architecture have helped Digital Realty customers reduce their hardware footprints by about 20%, Lane said.


Nevertheless, only 9% of IT leaders believe their data center is optimized, according to a recent survey by IDC, commissioned by Datalink, a data center design and management provider in Eden Prairie, Minn. No doubt the answer would be different if asked of hyperscale cloud providers, such as AWS, Microsoft or Google.

"That tells you they don't feel like they are operating like the AWS cloud at this point," said Kent Christensen, practice director for virtualization and cloud at Datalink. "They are evolving, but they are not evolving fast enough."

One place to start would be to reassess high availability (HA) and reliability, availability and serviceability (RAS) features. Businesses need to break away from the idea that redundancy and resiliency must be built into every piece of hardware to prevent failure, said Jyeh Gan, director of Dell EMC's Extreme Scale Infrastructure unit. Instead, organizations need to abstract the software from the hardware so it can run on anything, and then adopt resilient applications designed to survive hardware failures. That will make it possible to go without the HA and RAS features, he said.

"Most people aren't there yet, and they won't be for years," Gan said. "Even the hyperscalers aren't completely there."

When one company races ahead of a competitor, the company left behind often scrambles to modernize. That's when it is most likely to apply the efficiency lessons learned by hyperscale data center operators, Gan said.

But the transition cannot be as sudden or simple as shifting a gear and stepping on the gas. Instead, it needs to be accomplished in stages, he said, pointing to his work with companies that have gradually removed HA and RAS systems management features. Companies with a suite of software designed to deploy, manage and monitor servers shouldn't eliminate the entire suite to start, he said. They should instead move to the Redfish environment -- a standard RESTful API to manage servers -- as an initial step.
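For readers unfamiliar with Redfish, the sketch below shows roughly what that initial step looks like in practice: a single REST call to a management controller's standard Redfish endpoint to list the servers it manages. The host name and credentials are placeholders for illustration, not anything specific to Dell EMC's tooling.

    # Minimal sketch of a Redfish inventory query; host and credentials are placeholders.
    import requests

    BMC = "https://bmc.example.com"   # hypothetical management controller address
    AUTH = ("admin", "password")      # placeholder credentials

    # The Redfish service root is /redfish/v1/; the Systems collection lists managed servers.
    # verify=False is only for this sketch, since BMCs often ship with self-signed certificates.
    resp = requests.get(f"{BMC}/redfish/v1/Systems", auth=AUTH, verify=False)
    resp.raise_for_status()

    for member in resp.json().get("Members", []):
        system = requests.get(f"{BMC}{member['@odata.id']}", auth=AUTH, verify=False).json()
        print(system.get("Model"), system.get("PowerState"),
              system.get("Status", {}).get("Health"))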

It is easy to understand the drive to stay competitive, Gan said, but the process has to be smooth and methodical so data center operators aren't hit with too many unfamiliar concepts and technologies at once.

Trickle-down 'cloudonomics'

The characteristics of large public cloud data centers have started to show up more frequently in on-premises products used by more typical organizations. Hyper-converged infrastructure, for example, was modeled after the architectures of hyperscale cloud providers, said Kuba Stolarski, a research director at analyst firm IDC.

"That's really taking the Google, Facebook, etc., model to figure out how to do virtualized storage more efficiently," he said.

Another advancement that has begun to appear in some data centers is software-defined networking, Vantage's Yetman said.

"What a large cloud provider would do -- like an AWS or a Microsoft and others -- is look for ways to cut out the higher cost overhead they have," Yetman said.

That led to the design and development of low-cost switching. Rather than spend a few thousand dollars on a traditional vendor's switch for each rack, enterprises can buy something for $800 that does the job just as well, he said.

Facebook, Azure and AWS all use standard hardware to build their own versions of routers. Some large companies with custom-built infrastructures, such as Facebook and LinkedIn, have shared their designs. "Everyone can benefit from that and build a network at a lower cost, and it will still be reasonably supportable," Yetman said.

Hyperscale cloud providers' methodical management of their data centers sets them apart from most organizations, said Uptime's Traver, whose experience also includes more than 20 years at IBM working on various data center design and efficiency projects.

Hyperscale data center operators have documented how to react in any given situation, while many businesses rely on smart people who simply know how to do things, which amounts to "tribal knowledge."

A typical business, for example, may rely on employees to regularly talk with each other to run a data center. Conversely, a hyperscale data center operator may have hundreds of people at data center locations around the world. To consolidate that distributed knowledge, hyperscale operators typically maintain specific run books with solid, documented methodologies.

"Enterprises have been historically a little free and easy," Traver said. "The webscale properties had too much at risk."

Automate for efficiency

The efficiency of hyperscale operators comes, in large part, from the automation of manual processes and the use of homogeneous servers.

Businesses have started to reduce the various types of servers and virtual machines installed in data centers, Traver said. With less variation, data center operators can better manage the load. Efficient organizations group servers together with an orchestration layer that manages all of the servers as a whole.
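In practice, the orchestration layer treats a homogeneous pool as a single unit of capacity rather than as individual boxes. The sketch below is an illustrative, simplified version of that idea (a least-loaded placement loop over identical nodes), not any particular orchestration product; the node names and sizes are assumptions.

    # Illustrative only: place workloads across a homogeneous pool managed as a whole.
    POOL = [{"host": f"node{i:02d}", "vcpus": 64, "used": 0} for i in range(1, 9)]

    def place(vm_vcpus: int) -> str:
        """Put a VM on the least-loaded node; raise if the pool is out of capacity."""
        node = min(POOL, key=lambda n: n["used"])
        if node["used"] + vm_vcpus > node["vcpus"]:
            raise RuntimeError("pool is out of capacity")
        node["used"] += vm_vcpus
        return node["host"]

    for vm in (8, 16, 8, 32, 4):
        print(f"{vm}-vCPU VM placed on {place(vm)}")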

To reach peak utilization, data center operators need to predict actual rack loads, which is difficult for most large companies, said Jakob Carnemark, CEO of Aligned Data Centers.

Hyperscale data centers typically average around 15 kilowatts per rack, about five times the density of most data centers built today, he said. Organizations need to predict data center density in order to manage infrastructure efficiently.
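A back-of-the-envelope calculation shows how that density figure relates to rack loads. The server count and per-server wattage below are assumptions chosen for illustration, not numbers from Aligned Data Centers.

    # Back-of-the-envelope rack density estimate; the inputs are illustrative assumptions.
    servers_per_rack = 40      # assumed 1U servers in a standard rack
    watts_per_server = 375     # assumed average draw per server, in watts

    rack_kw = servers_per_rack * watts_per_server / 1000
    print(f"{rack_kw:.1f} kW per rack")   # 15.0 kW, roughly the hyperscale average cited above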

"It is nothing that anyone besides the extreme hyperscale players has been able to do," Carnemark said.

Vendors that sell data center products have taken notice of the strategies used by hyperscale cloud providers and should soon develop management tools accessible to more typical customers, Yetman said.

For example, Google has started to use artificial intelligence (AI) to manage cooling in its data centers. That resulted in a 10% annual savings on cooling. Any business would welcome a 10% cut in cooling costs. For Google, that meant a $100 million savings.

"The DCIM providers, if they are smart, will look to see how they can replicate that success and pass on the efficiencies to their customers, who tend to be enterprises," Yetman said.

While AI is too complicated for many companies to tackle on their own, at least one or two vendors will soon replicate what Google has done and help organizations manage their data center environments in a similar way.

Aside from sheer scale, there is little to keep an organization's data center from achieving the efficiency of a hyperscale data center.

A business that recognizes the need to put in place cloud-like efficiencies often attempts to get the entire data center team on board, Christensen said. If there is resistance, he has seen companies bring in another team to do it.

"That team is going to come in with such a fresh, new set of ideas and try to change the universe as the other team operates, but becomes less valuable over time," he said. "Things are changing fast, and people need to adjust."

