Linux clusters getting more mainstream

In this interview, Linux Networx executive Eric Pitcher describes the growing acceptance of Linux-based clusters in the enterprise and the challenges of managing them.

Clustering technology can be a relatively cost-effective way of bringing high-performance computing into an enterprise that needs to perform large and complex calculations and transactions. Networked Intel-based Linux servers, which can scale by adding nodes, serve budget-strapped enterprises very well. In this interview, Eric Pitcher, Linux Networx's vice president of product marketing, talks about where Linux clusters fit in the enterprise, the kind of company that should consider this type of investment, and the challenges of managing Linux clusters.

Clusters are mostly used in scientific and engineering circles right now. How can the technology be used in the enterprise?

Eric Pitcher: High-performance computing is much more mainstream now in the commercial sector, with automotive companies, aerospace companies and a number of other commercial entities using Linux clusters. Going beyond the areas of engineering, there are other applications of clusters today, ranging from clusters of servers used to manage massive Web sites [to] clusters that are running database applications -- Oracle, for example, has a suite of software that runs on cluster systems. Clusters are becoming more and more pervasive in other business applications.

What kind or size of company should make that kind of investment?

Pitcher: The good news about clustering [is that] the underlying technology is highly leveraged -- in the sense that Intel produces millions of computer chips a year that are used not only for high-performance computing, but for other types of applications as well. We can take advantage of the development cost of those commercial technologies and produce systems that are, from a cost-performance standpoint, much superior to systems based on proprietary technology. Businesses are looking at spending a lot less money for computing power comparable to what they got from proprietary systems a few years ago.

You can get a cluster system for a few tens of thousands of dollars and then scale it on up to multimillions of dollars, depending on the computational need.

What kind of jobs would a Linux cluster handle in the enterprise?

Pitcher: Engineering applications. The automotive industry uses clusters today for crash simulation -- creating a numerical model of an automobile crash. It also uses them for fluid dynamics studies aimed at reducing aerodynamic drag on automobiles. Large aerospace companies [and] aircraft manufacturers also use them for fluid dynamics studies.

The life sciences are rapidly adopting clusters for the whole drug discovery process, primarily in computational chemistry and the underlying genome sequencing.

Other applications are in the oil and gas sector, doing seismic processing and reservoir modeling.

There are many rocket scientists on Wall Street running calculations designed to support risk management. A number of Wall Street firms are using clusters for that type of application.

How easy or difficult is it to manage a Linux cluster?

Pitcher: It's very difficult if you don't have the right management tools. If you've got a hundred or a thousand computers that are all independent, then you need management software to make that a productive system.

Then there's the whole business of being able to manage a diverse workload. You need to have scheduling software and resource management to use all the CPUs and computing elements efficiently.

How do cluster management tools reduce TCO for clusters?

Pitcher: Think of a graph, with time running along one axis and the number of CPUs running along the other axis. Then you have this resource made up of little squares of time and CPUs. You want to be able to fill that resource with as much work as possible and not leave any open squares, which represent unutilized time on a particular CPU. So the scheduling and resource management software looks at all the jobs that are in the queue waiting to be run, and then maps those jobs against the available resources. The mapping takes place automatically, so it doesn't have to be done manually, the way it used to be in the old days before there was full management software.

Doing that reduces the TCO of a cluster because you're getting more out of it with management software than you would otherwise.
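
The grid Pitcher describes can be illustrated with a small sketch. What follows is not Linux Networx's scheduler or any real product's algorithm; it is a minimal first-fit example, and the Job class, the schedule function, the job names and the grid size are all hypothetical. It also simplifies by counting free CPUs per time slot rather than tracking which specific CPU runs each job.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int   # CPUs the job needs at the same time (assumed fixed)
    slots: int  # consecutive time slots the job runs for


def schedule(jobs, total_cpus, horizon):
    """Place each queued job at the earliest start slot with enough free CPUs.

    Returns the chosen start slot per job and the fraction of the
    CPU-by-time grid that ends up filled.
    """
    free = [total_cpus] * horizon  # free CPUs remaining in each time slot
    placement = {}                 # job name -> start slot
    for job in jobs:               # jobs are taken in queue order
        for start in range(horizon - job.slots + 1):
            window = free[start:start + job.slots]
            if all(f >= job.cpus for f in window):
                for t in range(start, start + job.slots):
                    free[t] -= job.cpus
                placement[job.name] = start
                break              # job placed; move to the next one
    used = total_cpus * horizon - sum(free)
    return placement, used / (total_cpus * horizon)


# Hypothetical queue on a 4-CPU cluster over 5 time slots.
queue = [
    Job("crash_sim", cpus=3, slots=2),
    Job("cfd", cpus=2, slots=3),
    Job("risk", cpus=1, slots=2),
    Job("seismic", cpus=2, slots=1),
]
placement, utilization = schedule(queue, total_cpus=4, horizon=5)
print(placement, utilization)
```

With this queue, the small "risk" and "seismic" jobs slot into squares the larger jobs left open, and 16 of the 20 squares are filled (80% utilization). Production schedulers weigh priorities, reservations and much more, but the idea of filling open squares automatically is the same.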

What are the technology challenges of bringing a Linux cluster into a heterogeneous environment?

Pitcher: I don't see any insurmountable challenges. I suppose training, although Linux is just another version of Unix, so it's nothing new and strange.

What is in store for clustering in the enterprise?

Pitcher: According to IDC, the cluster market for high-performance computing is growing at a compound annual growth rate of more than 20%. And that's expected to continue for the next several years. It's a clear indication of the way things are moving.

What's behind the increasing usage of Linux clusters?

Pitcher: What we're seeing is that, as customers look to refresh their systems, they're looking more closely at Linux. For example, we have a system at the technology center of Oracle, in Atlanta. The head of that center indicates that 75% of all prospects and customers that come through there are looking at Linux as a viable alternative.

How would the costs of a Linux cluster compare to a proprietary system like Unix?

Pitcher: For equivalent computing power, the cost of a Linux cluster is perhaps one-fifth to one-tenth the cost of a proprietary system. Of course, that depends on the application.

