When a company installs a Linux cluster, do its systems administrators live happily ever after? Tom Quinn, business development director for Linux Networx in Bluffdale, Utah, says the answer is a qualified 'yes.' In part two of this Linux cluster primer, he offers advice on simplifying cluster management and explains the pros and cons of recycling servers in a cluster.
This story originally appeared on SearchEnterpriseLinux.com.
Could you offer some tips for handling common but tricky problems that can plague administrators who manage Linux clusters on a daily basis?
Tom Quinn: If the cluster is configured properly from the beginning and the correct tools are installed, there is very little an administrator needs to do on a daily basis. Most problems will involve users and applications: managing job queues, prioritizing jobs, managing data and files, backing up accounts, and so on.
In other words, administrators will face the same problems they encounter with any shared IT resource. On very large systems, occasional hardware failures are bound to happen. When they do, an administrator may need to reconfigure a repaired compute node once it is reinstated into the cluster.
In poorly designed systems, this could involve using a portable terminal or 'crash cart,' reloading and reconfiguring the node's operating system, loading and configuring the correct cluster software stack, and finally loading or mounting user applications and data onto the node.
Again, with the right tools in place, this can be as easy as powering on the node and clicking a button.
Some companies tell me that they lack the in-house Linux administration skills needed to use Linux clusters. Isn't this a big obstacle to adoption?
Quinn: To get successful results from your cluster system, you will need administrators with an understanding of the application and the ability to modify it, general Linux system administration skills, an understanding of networking and network topologies, and eventually, a familiarity with parallel computing and clustering. If this expertise currently isn't available, there are many commercial training courses and professional services specifically designed to help round out the skills necessary to make your cluster work.
How can Linux clustering extend the life of an IT system composed mostly of older servers?
Quinn: It is possible to recycle older servers by incorporating them into a Linux cluster environment. You can create a 'new' cluster of retired servers or incorporate them into an existing cluster.
One of the drawbacks of using older servers, however, is that the cost of the manpower, software and/or networking hardware needed to shape the system into something productive in most cases exceeds the cost of buying newer equipment. Since processing technology advances so quickly (e.g., Moore's law) and hardware prices stay relatively constant, using older servers is often not worth the trouble.
Can you describe a specific Linux clustering implementation that resulted in increased performance for the company?
Quinn: Orbital Sciences, an aerospace company that develops rockets and satellite systems, relies on massive amounts of computing power to optimize and simulate launch vehicles for reliability and accuracy. Orbital's existing computer system was continually bogged down and unable to produce fast, accurate simulations.
To help meet deadlines, Orbital often had to outsource smaller simulation problems, which could cost up to $50,000 per job and $300,000 annually. The inefficiencies and frustration caused by its existing system led Orbital to investigate new computing architectures.
Orbital wanted to migrate to a Linux cluster because of the industry-wide reputation that clusters [have for providing] optimal performance for Computational Fluid Dynamics (CFD) applications. Orbital chose a Linux Networx Evolocity cluster system pre-integrated with a CFD application from Fluent, Inc.
Since implementing the cluster, Orbital has been able to complete jobs 30 times faster than with its previous system, which significantly reduces the number of problems it has to outsource to third parties.
Orbital engineers estimate that the Linux Networx CFD cluster more than paid for itself within the first year because of its fast results and the reduction in outsourcing needs, a savings of over $130,000 in less than one year.
More importantly, with the power and capabilities of the cluster, Orbital engineers now have the luxury of routinely running small tasks and achieving much more focused and valid results than they were able to accomplish previously.
Can you describe one that resulted in increased scalability?
Quinn: Unlike other architectures, clusters scale naturally, and with switches from interconnect vendors now offering well over 200 ports in a single switch chassis, the scalability is enhanced even further.
Weinman Geoscience is taking advantage of the scalability of Linux clusters. Weinman provides expert exploration consulting and seismic data processing services to the energy industry. Before switching to a Linux cluster, Weinman did not have nearly enough computing power to complete jobs of the size they needed to run.
Once their first Linux Networx cluster system was installed, Weinman was immediately able to pass the benefits of high performance computing directly to their customers. For example, one company had drilled four consecutive dry holes at a cost of $5 million each. After that company came to Weinman Geoscience, Weinman was able to process seismic data that led the customer to hit eight for eight good wells, leading to the discovery of half a billion barrels of oil.
The cluster not only helped their customer find new hydrocarbon reserves, but also made Weinman's computing price/performance more competitive with larger contractors.
The success Weinman has experienced with their cluster has led them to seamlessly scale their cluster system multiple times for increased performance and faster results.