Do you wait around for servers to finish jobs? Are servers hitting 90% utilization? Are disk drives pounding their...
If you answered yes to any of these questions, your server utilization is imbalanced and it's probably time for some data center hardware tuning.
Relatively inexpensive fixes
DRAM is one of the easiest system boosts available to fix server utilization problems. Jobs are bigger and operating systems leave more data in memory than past generations did. In most cases, doubling the DRAM makes a server behave much better. DRAM expansion can reduce page-swapping significantly, freeing up IO for other things.
Assess the disk drives if servers are overwhelmed. If one drive is spinning away while others are running with low load, the swap file is a prime suspect. Expanding DRAM should fix that problem, so if the imbalance continues after a memory upgrade, look for load concentrated on a few drives. Either spread the working data over more drives, or replace the most heavily loaded disks with higher-performance solid-state drives (SSDs).
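One way to spot the lopsided load described above is to compare each drive's busy percentage against the average of its neighbors. The following is a minimal sketch, not a monitoring tool: the function name `find_hot_drives`, the 2x ratio, the 50% floor and the sample figures are all illustrative assumptions, and the busy percentages would come from whatever monitoring you already run.

```python
from statistics import mean


def find_hot_drives(busy_pct, ratio=2.0, floor=50.0):
    """Return drives that are busy in absolute terms (>= floor percent)
    and at least `ratio` times busier than the average of the other drives.

    busy_pct maps a drive name to its busy percentage (0-100).
    The ratio and floor defaults are assumptions; tune them for your servers.
    """
    hot = []
    for name, pct in busy_pct.items():
        others = [v for k, v in busy_pct.items() if k != name]
        if not others:
            continue  # a single drive has nothing to be compared against
        # Clamp the baseline to 1% so near-idle neighbors don't make
        # every lightly used drive look hot.
        baseline = max(mean(others), 1.0)
        if pct >= floor and pct >= ratio * baseline:
            hot.append(name)
    return sorted(hot)


# Hypothetical readings: one drive pounding away, three nearly idle.
sample = {"sda": 92.0, "sdb": 11.0, "sdc": 8.0, "sdd": 14.0}
print(find_hot_drives(sample))
```

A drive flagged this way is a candidate for either of the fixes above: spreading its data across more spindles or swapping it for an SSD.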
If you're spreading out the data to ease the server utilization load, a RAID controller is the best option. Internal RAID controller hardware is inexpensive. RAID increases disk speed by a factor of five or less, so it's not in the speed league of an SSD, but can be much cheaper. Software-based RAID is another option. Always back up data before creating RAID volumes, as there is always a chance of something going wrong.
SSDs are beneficial in many situations. Often, servers are starved for storage performance. While hard disk drive capacity has increased from 256 MB to 8 TB in the last 30 years, the rate of random IO operations only increased from 50 to 150 IOPS per drive. SSDs with IOPS in the 400,000 range can solve this Moore's Law storage gap problem.
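The size of that gap is easier to see as arithmetic. This short sketch uses only the figures quoted above; decimal units (1 TB = 1,000,000 MB) are an assumption, and the variable names are illustrative.

```python
import math

# Figures quoted in the article. Decimal units are assumed.
OLD_CAPACITY_MB = 256
NEW_CAPACITY_MB = 8 * 1_000_000  # 8 TB
OLD_IOPS = 50
NEW_IOPS = 150
SSD_IOPS = 400_000


def iops_per_gb(iops, capacity_mb):
    """IOPS available per gigabyte of stored data."""
    return iops / (capacity_mb / 1000)


capacity_growth = NEW_CAPACITY_MB / OLD_CAPACITY_MB
iops_growth = NEW_IOPS / OLD_IOPS
old_density = iops_per_gb(OLD_IOPS, OLD_CAPACITY_MB)
new_density = iops_per_gb(NEW_IOPS, NEW_CAPACITY_MB)
# How many 150-IOPS hard drives it takes to match one 400,000-IOPS SSD.
hdds_to_match_ssd = math.ceil(SSD_IOPS / NEW_IOPS)

print(f"Capacity grew {capacity_growth:,.0f}x; IOPS grew {iops_growth:.0f}x")
print(f"IOPS per GB: {old_density:.1f} then, {new_density:.5f} now")
print(f"Hard drives needed to match one SSD on IOPS: {hdds_to_match_ssd:,}")
```

Capacity grew by four orders of magnitude while IOPS merely tripled, so each gigabyte on a modern hard drive gets a tiny fraction of the IO attention it once did.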
Vendors are quite happy to sell enterprise-class SSD at high prices, but a mid-range SSD is much more affordable and, while it won't deliver millions of IOPS, the performance will be much higher than what the server's disk drive delivered.
Another way to save money is to use a smaller-capacity SSD than the disk it replaced as the primary drive. Many data center workloads only call for a few tens of gigabytes of active storage per server. The rest can stay on the existing hard drives. With just a 256 GB SSD versus a terabyte, upgrades are more affordable.
Improved networking can resolve server performance problems as well. For example, 1 GbE is slow for networked storage, but upgrading the whole server farm is expensive. In some data centers, most of the racks already use 10 GbE, and applications experience a performance payoff from upgrading the remaining servers. This is especially true if the servers are virtualized.
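To get a rough feel for the difference between the two link speeds, estimate how long a bulk transfer from networked storage would take on each. This is a back-of-the-envelope sketch: the 100 GB data set and the 90% usable-bandwidth figure are assumptions for illustration, and real throughput depends on protocol overhead, the storage back end and contention.

```python
def transfer_seconds(size_gb, link_gbps, efficiency=0.9):
    """Estimate seconds needed to move size_gb gigabytes over a link.

    link_gbps is the nominal line rate in gigabits per second.
    efficiency is the assumed fraction of line rate available for payload.
    """
    gigabits = size_gb * 8  # 8 bits per byte
    return gigabits / (link_gbps * efficiency)


DATA_SET_GB = 100  # hypothetical working set pulled from networked storage

for link_gbps in (1, 10):
    seconds = transfer_seconds(DATA_SET_GB, link_gbps)
    print(f"{link_gbps:>2} GbE: about {seconds / 60:.1f} minutes")
```

On a virtualized host, every instance shares that one link, so a wait that is tolerable for a single job multiplies quickly.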
The container approach to virtualization allows more than twice as many instances per server as standard virtualization, and reduces networked IO enormously. Containers look to be a good layer to host low-grade tasks such as Web serving, but suffer security and maturity handicaps.
It might be time to evaluate your server real estate and move some tasks to a cloud provider. Targeted migration will avoid new server purchases and offer a low-risk way of getting cloud-savvy.
Big machines for big jobs
The next tier of improvements for a stressed server deployment is more complex: right-sizing and specializing servers. New technology merits a major rethink of how jobs like big data and databases are run.
Big-data systems architectures have evolved rapidly from big, fast servers to in-memory systems to GPU-based acceleration. With each hardware change, you can do much more with fewer servers, even though those servers increase in price.
Performance boosts of as much as 100x make this a cogent argument, but the step from big systems to in-memory systems often involves buying new 4- or 8-CPU servers with 512-TB DRAM capability.
On the other hand, adding GPUs to servers only requires spare PCI/PCIe slots, and gives a major boost irrespective of server size. The architecture that works best depends on many variables in any given use case. Regardless, a fast SSD for top-tier storage is mandatory and PCIe drives are likely the right choice.
Relational database management systems follow many of the same arguments as the big-data approach, but specific software -- Oracle's products are a good example -- is highly tuned to server configurations. Vendors can provide valuable guidance for hardware upgrades. The latest hardware relies on in-memory databases, a fast SSD and remote direct memory access networking for storage caches.
Techniques like load balancing and virtualization to replicate an app across multiple servers also help remove hot spots in the server farm. However, it's a complicated issue, with capacity considerations.
About the author:
Jim O'Reilly is a consultant focused on storage and cloud computing. He was vice president of engineering at Germane Systems, where he created ruggedized servers and storage for the U.S. submarine fleet. He has also held senior management positions at SGI/Rackable and Verari; was CEO at startups Scalant and CDS; headed operations at PC Brand and Metalithic; and led major divisions of Memorex-Telex and NCR, where his team developed the first SCSI ASIC, now in the Smithsonian.