Don Becker is working hard to improve booting and provisioning options in Beowulf, the pioneering open source cluster project he co-founded. Improvements are all that's possible, he warns, so don't expect a silver bullet -- like virtualization -- to end booting, provisioning or device driver compatibility problems.
In this interview with SearchOpenSource.com editor Jan Stafford, Becker describes his recent work on Beowulf and the inevitability of hardware administration headaches. Becker is also CTO of Scyld Inc., the software division of Penguin Computing, a server vendor in San Francisco.
SearchOpenSource.com: What are you working on in the Beowulf project now?
Don Becker: I focus mostly on the systems side. We're trying to do an even better job in booting and provisioning machines at a low level. I think that booting machines is a critical process, whether you're booting the physical machine and have to load the correct device drivers or whether you're trying to provision virtual machines. Application software companies and applications-focused end users don't want to deal with that.
Almost everybody else starts out with the statement, "Start with the perfect, factory-fresh clean install and then install our software on top of that." Not everybody can be the first. Most companies don't want to deal with getting to the point of providing an install that handles the underlying hardware and boots the machine. That's an area we're improving upon, and I think it's a critical one that a lot of people don't want to address.
Is there a virtualization angle to any of your work on Beowulf?
Becker: With our cluster system especially, being able to have fault tolerance is important. In the past, we had a single master model that was a single full-install machine. Now we have multiple machines that can fulfill that same role, and we're isolating virtual environments even further. That's the same thing that people doing Xen-like virtualization with full installs are trying to do. We're trying to keep the advantage of an isolated virtual environment. Ideally, each application would be in an isolated virtual environment, but you would still be able to view everything running as an overall whole.
IT managers frequently write to SearchOpenSource.com about their booting and provisioning troubles. Do you think there will ever be a means of reducing the problems dramatically?
Becker: We'll never move beyond the booting and hardware provisioning problems because there's always new hardware coming out. Just about every month there's some changed device. For instance, you can't get the old chip anymore, or there's a new, higher performance I/O chip that's embedded in the system. You're always chasing device support. You always need to add just that one last device driver.
That will never go away, and imagining that virtualization will make it go away is a mistake. One of the things that people doing virtual machines think they're going to get is only having to provision for one ideal virtual machine. You still have to deal with the underlying hardware, however, and the underlying hardware will always be getting better and always changing. That problem will never go away.
Device driver issues are a constant for Linux users. It sounds like you see no end in sight.
Becker: I've written a lot of device drivers in my life. More than almost anybody else, I think. And the thing you find is that there's always some new device that needs a device driver.
On Ethernet device drivers, for example, it was all Fast Ethernet, but there were hundreds of different designs. There were only 20 or 30 unique chips, but even there, they could implement different designs.
Computers, from the viewpoint of an application, are doing the same thing they've been doing for 20 to 25 years, and yet we always have new ways of interfacing with devices.
We're about to see another revolution, which is in network adapters -- that we [will] talk directly to [them] from the application level. That's a massive change in how you interface with them. And that brings about a new round of device drivers completely unlike the device drivers we had 10 years ago. So, that part of the world isn't going to stabilize anytime soon.