It's a challenge to derive meaningful comparisons of the expected performance of a real-world workload using different...
Vendors often carefully craft published server benchmarks to maximize performance results. While such benchmarks have some value, they often do not reflect the performance of a real workload.
There are a few givens in the IT benchmarking game. Solid-state drives are much faster than hard disk drives, especially in random I/O operations. Nonvolatile Memory Express (NVMe) PCIe SSDs, or flash drives, are four to five times faster still. Likewise, in-memory databases running in a large DRAM space leave traditional database engines in the dust and can achieve up to one hundred times the performance. Graphics processing unit (GPU)-based big data analytics run much faster than CPU-based solutions.
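The storage gap above comes down to service time. A minimal sketch of that arithmetic, using illustrative latency figures (the numbers below are assumptions for demonstration, not vendor specifications):

```python
# Rough service-rate arithmetic behind the HDD vs. SSD gap in random I/O.
# All latency figures are illustrative assumptions, not measured specs.

def iops_from_latency(avg_latency_s: float) -> float:
    """Single device, single queue: max random IOPS ~= 1 / avg service time."""
    return 1.0 / avg_latency_s

hdd_iops = iops_from_latency(8e-3)         # ~8 ms seek + rotation, 7.2K HDD
sata_ssd_iops = iops_from_latency(100e-6)  # ~100 us SATA SSD random read
nvme_iops = iops_from_latency(25e-6)       # ~25 us NVMe flash drive

print(f"HDD:      {hdd_iops:,.0f} IOPS")
print(f"SATA SSD: {sata_ssd_iops:,.0f} IOPS")
print(f"NVMe:     {nvme_iops:,.0f} IOPS")
print(f"NVMe vs SATA SSD: {nvme_iops / sata_ssd_iops:.0f}x")
```

With these assumed latencies, the NVMe drive lands at four times the SATA SSD's random IOPS, consistent with the four-to-five-times range quoted above.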
Few other IT benchmarks are so easily decided -- argue the benefits of Linux versus Windows and it's time to withdraw to the pub. Many server configurations have dependencies that affect the performance of a real workload. For example, a Sparc T5 chip running at 3.6 gigahertz (GHz) should beat an Intel CPU at 2.5 GHz, but Intel cores execute more work per clock cycle, so on a per-core basis the performance is close.
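A quick sketch of why clock speed alone is misleading: per-core throughput is roughly clock rate times instructions per cycle (IPC). The IPC values below are assumptions chosen to illustrate the point, not measured figures for either chip.

```python
# Per-core throughput ~= clock rate x instructions per cycle (IPC).
# IPC values here are assumed for illustration only.

def per_core_throughput(clock_ghz: float, ipc: float) -> float:
    """Relative per-core throughput, in billions of instructions/second."""
    return clock_ghz * ipc

sparc_t5 = per_core_throughput(clock_ghz=3.6, ipc=1.0)    # assumed IPC
intel_xeon = per_core_throughput(clock_ghz=2.5, ipc=1.5)  # assumed higher IPC

print(f"Sparc T5 (relative):   {sparc_t5:.2f}")
print(f"Intel Xeon (relative): {intel_xeon:.2f}")
```

Under these assumptions the 2.5 GHz part slightly edges out the 3.6 GHz part per core, which is why the text calls the race close despite the clock gap.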
IT benchmarking gets more complex if you delve deeper into the technology. Sparc T5 has 128 threads per CPU and can move 500 gigabytes per second across its memory buses in an eight-way configuration. The Xeon E7 v2 is close, and both have similar core counts. In the end, the most telling differences are acquisition price and openness. Intel wins on both fronts, thanks to its ability to run Solaris, Linux and Windows server OSes and the option to buy internet-priced add-in components such as extra drives, which makes a huge difference to lifetime costs in a data center.
IBM's POWER8 processor is the other CPU commonly seen in high-end data center servers. Here, a server performance test will produce markedly different results. While Power, like Sparc, has a high thread count per CPU, the raw core count is lower (12 versus 18). On the other hand, clock rates can reach 5 GHz.
In an actual server performance test, things get complicated. POWER8 systems can be configured with up to 16 CPUs by joining 4U servers together. Intel counters with the cloud approach, which interlinks large numbers of servers. While clouds scale out expansively, server-to-server bandwidth is lower than in IBM's scale-up Power server model. Again, Intel points to the significant price difference between Power-based systems and Intel servers.
Comparing apples to oranges
In aggregate, Intel-based servers win on price/performance over Oracle and IBM systems in any server performance test, except in specific cases where the tighter coupling of a larger core count makes sense. Even this isn't the whole story for usable IT benchmarks.
Intel's lead is noteworthy, but move to slightly lower core counts and AMD-based servers compete with Intel systems. It may save data centers money to cluster a set of AMD boxes rather than run a slightly smaller cluster of Intel servers, especially if GPUs are doing most of the work.
IBM mainframes use a much more complex architecture, with as many as 101 quad-core central processors. These should easily outperform a single top-of-the-line Intel-based unit, but at a vastly different acquisition price: a data center could install a high-end Intel server for about $50,000, which is just enough to pay for the front panel of a mainframe.
Confining our interest to the market for systems with 16 CPUs or fewer, the remaining configuration choices hinge on add-in units. GPU cards and other peripherals make an enormous difference in analytics workloads such as Hadoop, since they allow parallel computing on a grand scale.
Network and storage performance
For many applications that use networked storage or share data in clusters, such as virtualized server farms, it makes sense to use fast networks. Today that means at least a couple of 10 Gigabit Ethernet (GbE) links on each server, while systems with eight or more CPUs may need 4 x 10 GbE or even 40 GbE links for network-intensive applications. In the near future, top-end servers will use 25 GbE links, with 100 GbE as an option. Data centers are increasingly turning to RDMA Ethernet or InfiniBand links to lower system overhead and transaction latency.
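The link counts above translate into usable bandwidth roughly as follows. This sketch assumes the common rule of thumb of about 1 GB/s of usable throughput per 10 Gb/s of line rate, after encoding and protocol overhead; that conversion factor is an assumption, not a measured figure.

```python
# Back-of-the-envelope link sizing for the configurations mentioned above.
# Assumes ~1 GB/s usable per 10 Gb/s line rate (rule-of-thumb assumption).

def usable_gbytes_per_sec(links: int, gbits_per_link: int) -> float:
    """Aggregate usable bandwidth in GB/s for a set of Ethernet links."""
    return links * gbits_per_link / 10.0

configs = {
    "2 x 10 GbE":  usable_gbytes_per_sec(2, 10),
    "4 x 10 GbE":  usable_gbytes_per_sec(4, 10),
    "2 x 25 GbE":  usable_gbytes_per_sec(2, 25),
    "1 x 100 GbE": usable_gbytes_per_sec(1, 100),
}
for name, gb in configs.items():
    print(f"{name}: ~{gb:.0f} GB/s usable")
```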
Storage gear also has a major impact on overall IT workload performance. Hard drive RAID arrays are designed for around 750,000 IOPS, while all-flash arrays achieve between two million and four million IOPS. Even more important, latency on an all-flash array is much lower than on a hard drive array, allowing for app tuning and a much better response time.
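Turning those array figures into ratios makes the gap concrete. The IOPS numbers come from the text; the latency figures are illustrative assumptions added here to show why latency, not just throughput, dominates response time.

```python
# Ratios behind the array comparison above.
# IOPS figures are from the text; latency values are assumed for illustration.

hdd_array_iops = 750_000      # hard drive RAID array (from the text)
flash_array_iops = 3_000_000  # midpoint of the 2M-4M range in the text

hdd_latency_ms = 5.0          # assumed typical HDD-array read latency
flash_latency_ms = 0.25       # assumed all-flash read latency

iops_advantage = flash_array_iops / hdd_array_iops
latency_advantage = hdd_latency_ms / flash_latency_ms

print(f"IOPS advantage:    {iops_advantage:.0f}x")
print(f"Latency advantage: {latency_advantage:.0f}x")
```

Under these assumptions the flash array delivers roughly four times the IOPS but a twenty-fold latency improvement, which is why the text calls latency the more important difference.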
For most applications, the key IT benchmarking metric is price/performance. Here, Intel's advantage is marked: four or more times better than Sparc, and even more compared with IBM's proprietary chips. As everyone from server vendors to big cloud providers enters the U.S. and E.U. markets directly, prices on Intel-based gear will fall much further.
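The price/performance metric itself is simple arithmetic: performance per dollar, normalized to a baseline system. The prices and scores below are placeholder assumptions to show the calculation, not real quotes or benchmark results.

```python
# Price/performance normalized to a baseline system.
# Prices and performance scores are placeholder assumptions, not real data.

systems = {
    "Intel": {"price": 50_000, "perf": 100},   # assumed acquisition cost/score
    "Sparc": {"price": 200_000, "perf": 100},  # assumed acquisition cost/score
}

baseline = systems["Intel"]["perf"] / systems["Intel"]["price"]
ratios = {
    name: (s["perf"] / s["price"]) / baseline
    for name, s in systems.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the baseline price/performance")
```

With equal performance scores and a four-fold price gap, the baseline system comes out four times better on price/performance, matching the shape of the claim above.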
About the author:
Jim O'Reilly was vice president of engineering at Germane Systems, where he created ruggedized servers and storage for the U.S. submarine fleet. He has also held senior management positions at SGI/Rackable and Verari; was CEO at startups Scalant and CDS; headed operations at PC Brand and Metalithic; and led major divisions of Memorex-Telex and NCR, where his team developed the first SCSI ASIC, now in the Smithsonian. O'Reilly is currently a consultant focused on storage and cloud computing.