Data center density used to be an apocalyptic topic, which may be why many IT organizations are still lingering
at 4 to 6 kW per rack. But modern power and thermal management techniques are ready to support server rack designs above 10 kW.
Skyrocketing processor core counts and dense blade server chassis designs made runaway computer room air conditioner (CRAC) and power costs seem inevitable. But higher density is not killing servers off as designers feared. Virtualization, energy-efficient hardware, aggressive cooling containment and higher acceptable operating temperatures have combined to keep heat exhaustion at bay.
How big a problem is heat?
Instead of one server per workload, a modest server with a virtualization hypervisor supports 10, 20 or even more workloads. Facilities would have to cram servers into every open rack space to match the workload capacity enabled by virtualization.
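The space savings from consolidation are easy to quantify. The sketch below works through the arithmetic with illustrative numbers -- the workload count, consolidation ratio and servers-per-rack figures are assumptions, not measurements from any real facility.

```python
# Illustrative consolidation math: racks needed before and after
# virtualization. All figures here are assumptions for the sketch.

WORKLOADS = 200          # total workloads to host
VMS_PER_HOST = 20        # assumed consolidation ratio with a hypervisor
SERVERS_PER_RACK = 20    # assumed 1U servers per rack

def racks_needed(workloads, workloads_per_server, servers_per_rack):
    """Racks required, rounding servers and racks up to whole units."""
    servers = -(-workloads // workloads_per_server)  # ceiling division
    return -(-servers // servers_per_rack)

# One workload per server: 200 servers fill 10 racks.
print(racks_needed(WORKLOADS, 1, SERVERS_PER_RACK))              # 10
# 20 VMs per host: 10 servers fit in a single rack.
print(racks_needed(WORKLOADS, VMS_PER_HOST, SERVERS_PER_RACK))   # 1
```

At a 20:1 consolidation ratio, ten racks of one-workload-per-server hardware collapse into one densely loaded rack -- which is exactly why the remaining rack runs hotter.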
At the same time, chips are fabricated with denser transistors and lower clock speeds, so increasing the number of processor cores in an equipment refresh barely changes the rack's energy draw.
Fewer, better-utilized servers in fewer racks have changed how we apply cooling. Rather than cool the entire data center with coarse air handling strategies, such as hot/cold aisles, to economize air flow across the space, operators deploy containment strategies that shrink the operating area to a much smaller room, or even box in a few racks. In-row or in-rack cooling systems handle these racks, reducing or even eliminating reliance on room-level CRAC units.
In addition, organizations such as the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) support raising allowable server air inlet temperatures to 80 or even 90 degrees Fahrenheit.
With these advances in energy management, hot spots and cooling inefficiencies are unlikely; when they do occur, they usually indicate a poorly designed or badly retrofitted facility.
Hot spots and other cooling problems
Even with the best containment and high-efficiency cooling systems, hot spots within server racks still happen due to suboptimal selection or placement of computing equipment.
Unintended obstructions or accidental gaps in the air flow path cause heat to build. For example, omitting a server rack's blanking plates lets cooled air enter the rack at an unexpected location, weakening flow to other servers and raising outlet temperatures.
Vast increases in server power also cause cooling problems. For example, replacing several white-box 1U servers with a high-end blade system dramatically increases the rack's power consumption, and inadequate air flow can starve a full complement of blade modules of cooling. If the cooling was not designed for this class of server, a hot spot usually develops.
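A quick back-of-the-envelope power comparison shows why such a swap stresses the cooling plant. The wattages below are illustrative assumptions for the sketch, not measured figures for any product.

```python
# Rough rack power comparison: a shelf of 1U servers versus blade
# chassis. All wattages are assumed values for illustration only.

def rack_power_kw(units, watts_each):
    """Total rack draw in kW for identical units at a given wattage."""
    return units * watts_each / 1000.0

before = rack_power_kw(12, 400)    # twelve 1U white-box servers
after = rack_power_kw(2, 6000)     # two fully populated blade chassis

print(f"before: {before:.1f} kW, after: {after:.1f} kW")
```

Under these assumptions the rack jumps from 4.8 kW, comfortably inside the common 4 to 6 kW envelope, to 12 kW -- well past what legacy room-level cooling was sized to remove.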
When you increase server rack density, invest in data center infrastructure management (DCIM) and other systems management tools that collect and report temperature data from thermal sensors in each server and rack. These tools identify breached thermal limits and take action, from alerting technicians to automatically invoking workload migrations and system shutdowns to prevent premature failures.
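The alert-and-act logic such tools apply can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the thresholds, server names and temperature readings are hypothetical, and a real DCIM product would pull readings from an actual sensor API rather than a hard-coded dictionary.

```python
# Minimal sketch of DCIM-style thermal escalation. Thresholds and
# sample readings are hypothetical stand-ins for real sensor data.

WARN_F = 90.0       # assumed limit: alert a technician
CRITICAL_F = 105.0  # assumed limit: migrate workloads and shut down

def classify(temps_f):
    """Map {server: temperature_F} to the action each server warrants."""
    actions = {}
    for server, temp in temps_f.items():
        if temp >= CRITICAL_F:
            actions[server] = "migrate_and_shutdown"
        elif temp >= WARN_F:
            actions[server] = "alert_technician"
        else:
            actions[server] = "ok"
    return actions

sample = {"r12-u03": 84.0, "r12-u07": 93.5, "r12-u11": 106.2}
print(classify(sample))
```

The escalation order matters: checking the critical threshold first ensures a badly overheating server triggers migration and shutdown rather than just a technician alert.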
When a server rack design produces hot spots, the IT team can redistribute the hardware. Rather than filling a single rack, move equipment -- as much as half -- to a second rack if space is available, or relocate the overheating system.
If space is not available for a redesign, add point cooling devices, such as portable, self-contained air conditioners made for data center use. If the rack is tightly packed using in-row or in-rack cooling units, it may be more effective to reduce the set point temperature rather than open the containment barriers to add a point cooling unit.
Over the long term, more disruptive technologies can aid heat management.
Water-cooled racks pass chilled water through cabinet doors or other air pathways. Water-cooled server racks address broad heating problems -- especially when lower air temperatures and higher flow rates alone don't work.
Immersion cooling submerges servers in a bath of cooled, non-conductive, non-corrosive liquid such as mineral oil. This technique promises high efficiency, almost no noise and long thermal ride-through in the event of power loss.
However, these point-of-heat options are better suited to new data center builds, not ordinary technology refresh cycles.