Seeing gains made by Linux in the corporate application clustering segment, Microsoft came on strong with its own cluster offerings. But is a Windows cluster strong enough? Clustering expert Bjorn Skare thinks not.
Skare, president and CEO of Linux cluster software vendor Scali Inc., Marlborough, Mass., defines the differences between Linux and Windows clusters in this pre-LinuxWorld interview. He also offers tips for deploying clusters in mixed Windows/Linux environments and names some mistakes IT shops make in deploying clusters.
Scali is demonstrating a new, major upgrade to its Scali Manage and Scali MPI Connect suites at the LinuxWorld Conference & Expo.
Can you offer some technical and best practices advice for deploying and managing a Linux cluster in an environment that includes Windows?
Bjorn Skare: This is really an application issue. You need to make sure that the application you need can run on a Linux cluster. If that is the case, interfacing to the cluster and managing it from your Windows environment should not be a problem. Most cluster management solutions provide a Windows client or are Web-based, and even if they are not, you can always use an X Window System software package.
What are the typical issues that come up in integrating Linux clusters in a Linux-Windows environment? What are some fixes to those problems?
Skare: File sharing is a common issue; you typically need to share files between your Windows environment and your Linux cluster. For example, you might do some initial processing on your Windows workstation and then transfer a file to the Linux cluster for further analysis. This is easy to address because Linux supports CIFS [Common Internet File System] -- the Windows file-sharing protocol -- and you can also get NFS clients for Windows, though the latter is typically more expensive.
Why use Linux for clusters instead of Microsoft's clustering options?
Skare: The application very often drives the decision. Many parallel applications are not available on Windows. Many customers also find that applications perform better on Linux clusters than on Windows clusters. One reason is that Linux can be customized to fit the application, reducing OS overhead by stripping out parts that are not needed. It's not by chance that on the Top 500 supercomputers list there are 334 Linux-based systems, and only one Windows-based system.
In addition, today Linux clusters are easier to manage because of the availability of remote administration protocols, such as secure shell [SSH]. That might change once Microsoft ships its Windows Server 2003, Compute Cluster Edition.
In your opinion, how well does Linux stack up to Windows today in adoption/acceptance, functionality, security and application availability?
Skare: First, people often compare Windows to Linux as if these were the tradeoffs that companies and users are making. In most situations, this isn't the case. Most of the growth in Linux seems to be coming from the Unix world. That said, Linux is a robust, reliable and secure solution. Applications that have historically been strong on Unix are moving rapidly to Linux, giving customers the price-performance advantage of industry-standard servers without locking them into a particular hardware architecture. A clear example of this is in the high-performance computing area, where Linux clusters are becoming the standard deployment platform for applications such as oil and gas exploration, pharmaceutical drug discovery, automotive crash test analysis and financial risk analysis.
Are Linux clusters being adopted by corporations and not just by the scientific community?
Skare: Linux clusters are being adopted by many corporations today. Linux clusters are enabling more and more companies to get access to supercomputing-level performance. This democratization of supercomputing is allowing companies and users to approach their businesses in ways that they couldn't consider in the past.
Examples include the growth of product simulation in product development organizations. In the past, many of these groups couldn't afford to do simulations to improve their designs. With the price-performance of Linux clusters, they can do this kind of work, enabling them to get better products to market sooner. In financial services, smaller firms are able to leverage clusters for risk analysis and Monte Carlo simulations to drive trading decisions in a timely manner.
In the past, only the large firms with large IT staffs and budgets could perform these kinds of calculations. Companies are looking for advantages, and Linux clustering enables them to do things they could not have done effectively in the past.
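To make the Monte Carlo risk analysis mentioned above concrete, here is a minimal Python sketch of why such workloads suit clusters: the simulation splits into independent chunks that need no communication until the final aggregation. It is an illustration only, not a description of any firm's or vendor's method. The function names, the return model (independent, normally distributed daily returns) and every parameter value are assumptions made for the example.

```python
import random


def simulate_chunk(seed, n_paths, mean=0.0005, stdev=0.02, horizon_days=10):
    """Simulate n_paths portfolio paths and return each path's total return.

    Assumed toy model: daily returns are independent draws from a normal
    distribution. Each chunk has its own seed, so chunks are independent.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        value = 1.0
        for _ in range(horizon_days):
            value *= 1.0 + rng.gauss(mean, stdev)
        results.append(value - 1.0)
    return results


def value_at_risk(returns, confidence=0.95):
    """Return the loss at the (1 - confidence) quantile of simulated returns."""
    ordered = sorted(returns)
    index = int((1.0 - confidence) * len(ordered))
    return -ordered[index]


def run(n_chunks=8, paths_per_chunk=2000):
    """Run all chunks and aggregate.

    The loop here is sequential; on a cluster, each call to simulate_chunk
    could be dispatched to a different node, with only the results gathered.
    """
    all_returns = []
    for chunk_id in range(n_chunks):
        all_returns.extend(simulate_chunk(seed=chunk_id, n_paths=paths_per_chunk))
    return value_at_risk(all_returns)


if __name__ == "__main__":
    print(f"Estimated 10-day 95% VaR: {run():.2%} of portfolio value")
```

Because the chunks share nothing but their parameters, adding nodes shortens the run roughly in proportion, which is what lets a smaller firm finish such a calculation in time to act on it.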
What's made clustering more attractive to businesses?
Skare: Businesses are ready for clusters today because deploying Linux clusters has become simpler and [the products are] more readily available. The tier-one hardware vendors are selling pre-defined cluster configurations.
On top of that, companies such as Scali are working with major players in the industry to create validated configurations that include the entire hardware and software stack, up to the application, so the customer does not need to take the role of a system integrator and is assured that all the components work together.
In addition, the availability of mature, commercially supported cluster management tools reduces the cost of cluster deployment and ongoing maintenance.
What is the biggest mistake IT shops make when implementing Linux clusters?
Skare: Many approach implementing clusters with a very hardware-centric view of what a cluster consists of. It turns out that often the servers and the hardware are the easiest part of the process. Remember that a cluster, in addition to the hardware, also has an operating system, kernel updates, specialized interconnects and drivers, storage considerations, workload management challenges, applications to be deployed and more.
The biggest mistake they make is not understanding that all this other 'stuff' is what can make clustering complex. They need to have a clear understanding of how they will install, manage and maintain this environment. Ideally, their approach should treat cluster management as a key consideration right from the start.