Buy the right cloud hardware for your IT applications

Cloud applications typically rely on x86 scale-out servers. Learn how to optimize a cloud data center deployment and what hardware to buy for special use cases.

Data center managers must work with the applications team to purchase cloud hardware that fits the workloads, from flexible scale-out clouds to demanding specialized use cases.

When moving to a cloud architecture, install rack-based and largely identical servers. Tuning a set of hardware to application use cases is a matter of determining which pieces of standard hardware work best, and how to arrange them. For example, some applications will benefit from a simple re-racking of servers and storage.

The move to commercial off-the-shelf (COTS) data center systems is now mainstream, and the IT department can choose from a spectrum of cloud hardware from various vendors.

Cloud hardware for most use cases

Many IT teams are converting applications to run on a cloud infrastructure to take advantage of flexible deployment and scaling. The baseline cloud server configuration is either a dual-processor 1U server, or a 1U twin server, which, though smaller, allows two servers per rack U of space.

The 1U size limits cooling fan performance, so these servers' CPUs tend to be four- to eight-core devices with low-end power profiles. This works for most cloud deployments, as two six-core CPUs support 96 basic virtual machines (VMs), requiring 96 GB of DRAM. Container virtualization, implemented with Docker or another containerization technology, increases the VM count, but requires less DRAM per instance.
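The arithmetic behind that sizing can be sketched in a few lines. The oversubscription ratio and per-VM memory footprint below are assumptions chosen to match the 96-VM / 96 GB example above, not vendor figures:

```python
# Rough capacity sketch for the baseline dual-socket 1U server described
# above. VMS_PER_CORE and DRAM_PER_VM_GB are illustrative assumptions.

CORES_PER_CPU = 6         # six-core, low-power CPUs
CPUS_PER_SERVER = 2       # dual-processor 1U server
VMS_PER_CORE = 8          # assumed vCPU oversubscription for "basic" VMs
DRAM_PER_VM_GB = 1        # assumed memory footprint of a basic VM

def server_capacity():
    cores = CORES_PER_CPU * CPUS_PER_SERVER
    vms = cores * VMS_PER_CORE
    dram_gb = vms * DRAM_PER_VM_GB
    return vms, dram_gb

vms, dram = server_capacity()
print(f"{vms} VMs, {dram} GB DRAM")  # 96 VMs, 96 GB DRAM
```

Raising the per-core VM count, as container virtualization allows, scales the instance total up while the DRAM-per-instance figure drops.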

General-purpose cloud servers are stateless, with no attached disk drives; however, a small direct-attached flash drive enables faster startups from a locally booted hypervisor. With containerization, the flash drive also holds the OS image and possibly even standard app stacks, such as LAMP.

Replacing older x86 servers with the current generation reduces the server count for the same workload by at least 50%, and often much more. Without local disk drives, these servers also run roughly 10% cooler than a traditional model. Don't expect to put those savings aside for long -- keep capital free for horizontal expansion or to build capacity for new use cases.

Key vendors offer an abundance of reference designs that target interoperability, design quality and uniformity. COTS equipment offers well-defined specifications to guarantee interoperability between various pieces of gear. Consider uniformity in the design of server cores and storage appliances, as well as cooling and power systems, and the mix of cable connections supported in any given product family.

Networking for a cloud architecture can rely on a single 10 Gigabit Ethernet (GbE) port per server, but a dual-port configuration allows a redundant rack-top switch setup that maintains server access if one switch fails. With dual ports, you can also dedicate one link to networked storage. New motherboards with integrated LAN-on-motherboard (LOM) controllers typically support dual 10 GbE ports. Rack-top switches connect to the network backbone over 40 GbE or 100 GbE links.
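When sizing those backbone uplinks, a quick oversubscription check is useful. The server and uplink counts below are illustrative assumptions, not a recommendation from this article:

```python
# Back-of-the-envelope oversubscription check for a rack-top switch:
# total server-facing bandwidth divided by total uplink bandwidth.

def oversubscription(servers, ports_per_server, port_gbps,
                     uplinks, uplink_gbps):
    downlink = servers * ports_per_server * port_gbps
    uplink = uplinks * uplink_gbps
    return downlink / uplink

# Assumed example: 40 dual-port 10 GbE servers behind two 100 GbE uplinks
ratio = oversubscription(40, 2, 10, 2, 100)
print(f"{ratio:.1f}:1 oversubscribed")  # 4.0:1
```

A ratio near 1:1 means the rack can drive the backbone at full speed; higher ratios are common and acceptable when most traffic stays inside the rack.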

Clouds run best on a mix of block and object storage, all connected via Ethernet. Storage tiering improves operations by moving active data from Tier 1 enterprise hard disk drives (HDDs) onto much faster solid-state drives (SSDs), which are typically smaller than the HDDs they replace. Cold data and Tier 1 backups can reside on slower, higher-capacity Tier 2 bulk storage drives.
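A tiering policy like the one described above can be sketched as a simple placement rule. The access-frequency thresholds and tier names here are assumptions for illustration, not part of any particular product:

```python
# Minimal sketch of a storage-tiering decision: hot data is promoted
# to SSD, warm data stays on enterprise HDDs, and cold data and backups
# are demoted to bulk Tier 2 drives. Thresholds are assumed values.

def place(accesses_per_day):
    if accesses_per_day >= 100:
        return "tier0-ssd"    # hot: fast SSD
    if accesses_per_day >= 1:
        return "tier1-hdd"    # warm: Tier 1 enterprise HDD
    return "tier2-bulk"       # cold data and backups

print(place(500))  # tier0-ssd
print(place(10))   # tier1-hdd
print(place(0))    # tier2-bulk
```

Real tiering engines track access heat maps per block or per object, but the buying implication is the same: the SSD tier can be much smaller than the HDD capacity it fronts.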

To choose hardware vendors, focus on quality rather than price; quality doesn't mean expensive anymore. Look for good airflow design on the equipment, and reports to back up performance and other claims, such as a presentation on the vendor's field return rates. Some of the best-quality gear comes from original design manufacturers (ODMs), which sell in huge volumes to the major cloud service providers (CSPs). These ODMs now sell directly in the U.S. and EU.

Cloud deployments achieve redundancy virtually -- a healthy unit automatically replaces a failing one rather than letting the application fail. Redundant power supplies and keyboard/video/mouse ports for debugging individual servers don't fit the modern cloud. However, technicians will still need to remotely power units on and off and reset them. Don't buy any gear that doesn't support remote control.

Special use cases

Not every application uses resources equally; application teams need big virtual instances for database work, for example.

Today's sweet spot is the in-memory database, which pairs high-core-count CPUs with a lot of DRAM. A typical server is a 2U or 3U model with four CPUs, each with eight to 16 cores, and as much as 1 TB of DRAM.

Data centers increasingly provide local instance stores to databases to speed up storage operations. These should be very fast SSDs or flash cards configured as a mirrored pair for data integrity. This Tier 0 storage should connect to network storage, typically over four 10 GbE links, or more depending on the use case.

Graphics processing units (GPUs) let specialized cloud servers render video services, host high-performance computing applications, or support other specialized applications. If you require this performance, build out large-instance configurations with a lot of DRAM, or specialty motherboards with additional GPU and memory. Local instance storage is also practical for this setup.

At the rack level

For improved performance, put Tier 1 storage and servers in the same rack -- bandwidth within a rack is much higher than across the backbone. Depending on use case, you may have high-performance storage and servers together in many racks, leaving bulk Tier 2 storage in separate racks.

Replacing all-HDD storage arrays with hybrid flash-plus-HDD arrays reduces the drive count by a factor of four for equal capacity. Add deduplication and/or compression, and the reduction may approach eightfold or tenfold, substantially lowering power use.
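The drive-count arithmetic above is easy to verify. The drive capacities and the 2.5x data-reduction factor below are illustrative assumptions; the 4x and roughly 10x figures come from the rough estimates in this article:

```python
import math

# How many drives a given usable capacity requires, before and after
# larger flash-backed drives and data reduction. All inputs are
# assumed example values, not measured array specifications.

def drives_needed(capacity_tb, drive_tb, reduction_factor=1.0):
    return math.ceil(capacity_tb / (drive_tb * reduction_factor))

all_hdd = drives_needed(400, 4)              # 100 x 4 TB HDDs
hybrid = drives_needed(400, 16)              # 25 drives: 4x fewer
hybrid_reduced = drives_needed(400, 16, 2.5) # ~10x fewer than all-HDD
print(all_hdd, hybrid, hybrid_reduced)  # 100 25 10
```

Fewer spindles mean fewer failures and less power and cooling, which is where most of the operational savings come from.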

Self-contained hardware sets

Modular systems, whether containers or pre-configured racks, are an alternative to conventional racks.

Modular rack systems and converged systems come in various flavors. Typically, they use large fans and shared power supplies for efficiency, and some designs run without chilled air, saving power. These capacity "blocks" are pre-integrated and tested to ensure that all the hardware, OS and application components work together. Converged systems alleviate the burden of hardware expertise and offer fast installation, since the vendor tests the assembled configuration prior to shipping. The tradeoff is cost.

Many big CSPs use container data centers, which house sets of integrated racks in a compact space. This offers even more flexible and fast capacity scaling.

Once the right cloud hardware is in place, the next step is capacity planning to support apps long term.

About the author:
Jim O'Reilly is a consultant focused on storage and cloud computing. He previously held top positions at Germane Systems -- creating ruggedized servers and storage for the U.S. submarine fleet -- as well as SGI/Rackable and Verari, startups Scalant and CDS, and PC Brand, Metalithic, Memorex-Telex and NCR.
