We at Linux Networx agree that clusters of the type Dr. Terry has described do exist, but sophisticated users demand much more from their cluster systems. On the surface, Linux clusters are composed of a collection of microprocessor-based servers. However, Linux clusters can achieve high efficiency and high productivity if the subassemblies have been fully tested and validated, and if the cluster is delivered with management tools, integrated applications, and professional services.
Organizations are quickly realizing the many advantages of Linux clusters, which account for the technology's enormous growth. IDC recently reported that 25% of all worldwide high performance computing (HPC) shipments in 2003 were clusters, and that number continues to grow.
In designing a Linux cluster, selecting the correct interconnect for the customer's application, as well as the implementation of the system as a whole, is important in achieving a high productivity system. Advances in high-speed interconnects are continuing to improve the efficiency of clusters. There is now a range of interconnects from which to choose, such as Myrinet, Quadrics, and InfiniBand, and the choice affects a system's efficiency.
The key point to focus on is not floating-point operations per second (flops), but rather how productive the machine will be over its life. This is why Linux Networx focuses on building high productivity cluster computer systems, rather than systems that are only capable of running a fast Linpack benchmark number to make the Top500 list. The organization that is using the computing system for virtual product development, or key research, must have a machine that can achieve maximum sustained performance over its life to provide the highest return on investment (ROI) possible.
What's right and wrong with the argument that the performance of Linux clusters, where processors are connected through I/O links, is severely limited by PCI bottlenecks?
All computers have bottlenecks that are exposed by certain applications. The relevant question is: what percentage of HPC applications can be run cost-effectively on clusters today? That number is quite large, as evidenced by rapidly rising cluster sales.
To answer the question, removing the PCI bus from the interconnect path may contribute to better performance for some HPC codes. However, the interconnect in an HPC system is only one element of the total solution. Other important elements contributing to performance include the CPU, the chipset, the compiler, and the global file system.
Interconnect vendors continue to make great strides in addressing the needs of the HPC community. As they deliver higher-bandwidth, lower-latency interconnects, an even greater fraction of applications will run cost-effectively on clusters rather than on proprietary systems.
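To make the bandwidth and latency point concrete, here is a minimal sketch of the common first-order model in which the time to send one message is a fixed latency plus the message size divided by the bandwidth. The function names and every number in the usage note are illustrative assumptions, not measurements of any particular interconnect or application.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: fixed latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def communication_fraction(compute_s, message_bytes, latency_s, bandwidth_bytes_per_s):
    """Share of one compute-then-communicate step that is spent communicating."""
    comm_s = transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)
    return comm_s / (compute_s + comm_s)


if __name__ == "__main__":
    # Hypothetical workload: 1 ms of computation followed by one 8 KiB message.
    compute_s = 1e-3
    message_bytes = 8192

    # Hypothetical interconnects, chosen only to show the trend.
    slower = communication_fraction(compute_s, message_bytes, 50e-6, 100e6)
    faster = communication_fraction(compute_s, message_bytes, 5e-6, 1e9)

    print(f"slower interconnect: {slower:.1%} of each step is communication")
    print(f"faster interconnect: {faster:.1%} of each step is communication")
```

Under these assumed figures the communication share of each step shrinks as latency falls and bandwidth rises, which is the sense in which a code that was communication-bound on an older interconnect can become cost-effective on a newer one. Real applications overlap communication with computation and send many message sizes, so this is only a rough guide.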
Linux cluster benefits still include great flexibility and scalability; price will not change this. Cluster computing takes advantage of commodity hardware and is able to ride consumer PC trends for lower prices on processors, memory, drives, etc. Traditional supercomputers are unlikely to ever achieve the volumes that greatly benefit Linux clusters.
In defense of supercomputers, there are still some applications that don't run well on cluster systems. Some applications cannot be easily parallelized, require shared memory, or contain a high-performance graphics pipeline that prevents the application from being ported to clusters. Customers with these types of needs may not benefit from clusters. However, they may be able to save some money by partitioning the processing and doing some pre- or post-processing on a Linux cluster.
What are the scalability advantages of Linux clusters?
The scalability advantages of clusters have been widely publicized. It is no accident that more than half of the 20 fastest computers in the world are clusters. Unlike other architectures, clusters scale naturally. And with switches from interconnect vendors now reaching well over 200 ports in a single switch chassis, the scalability is enhanced even further.
Is the Top500 listing a real measure of HPC systems' abilities? Doesn't it just test total processing power and not application performance and system efficiency? Does it have other shortcomings or strengths?
No standard benchmark is capable of predicting application performance on the wide variety of HPC codes that exist. Linpack measures a certain important subset of a system's overall performance profile. Linpack benchmark tests correlate well with the performance of some HPC applications, and poorly with others.
Other benchmarks capture a different, and in some cases a more holistic, subset of a system's overall performance that may be useful for predicting the performance of other applications. However, no single benchmark will be able to accurately predict performance on the large number of HPC codes that exist. There is no substitute for benchmarking the actual application that will be run by users.
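As a rough illustration of what benchmarking the actual application can look like in its simplest form, the sketch below times repeated runs of a workload and reports the best wall-clock time. The names `time_workload` and `example_workload` are hypothetical, and the loop inside `example_workload` is only a stand-in for the user's real code, not a meaningful HPC kernel.

```python
import time


def time_workload(workload, repeats=5):
    """Run workload() `repeats` times and return the best wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)


def example_workload():
    # Hypothetical stand-in: replace with a call into the real application.
    total = 0.0
    for i in range(100_000):
        total += i * 0.5
    return total


if __name__ == "__main__":
    best = time_workload(example_workload)
    print(f"best of 5 runs: {best:.6f} s")
```

Taking the minimum over several runs is one common way to reduce noise from other activity on the machine. A production evaluation would run the real code on representative input data and at the node counts the users actually intend to use.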