It seems that every day brings exciting new server technology advancements. Today’s servers are far more powerful than those of just a few years ago, packing enormous concentrations of processing power and memory while also taking unprecedented steps to curb power use. The real questions are: How do IT professionals perceive these advances, and how do they use them in their own environments?
Stephen Bigelow, senior technology editor, sits down with Bill Kleyman, director of technology at World Wide Fittings Inc., to gain insight into these questions and more server management topics.
You can also listen to a podcast of this Q&A.
Bigelow: High-volume, low-power computing has made a really big splash recently. Systems are appearing with low-power Tilera processors, and Intel Atom boxes are showing up with 500-plus processors in them. How do you see this kind of technology fitting into data centers at everyday businesses?
Kleyman: Well, let’s talk about the Tilera processor for just a little bit. It’s a phenomenal piece of technology that creates the kind of server density that some high-end data applications require, and that some high-end programmers are looking for. The Tile64 family of multi-core processors is named Tile64 because you’re looking at literally 64 identical processor cores – they call them tiles – interconnected on the actual chip. What this means for the end user, or the IT admin, is that each tile is a complete, full-featured processor with integrated L1 and L2 caches right on board. So one proc will have 64 identical processor cores, and you could run a full, individual operating system on each core. Now imagine putting several of these procs on a single server.
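The one-workload-per-tile idea can be sketched in miniature with ordinary Python. This is an illustrative analogy, not Tilera’s toolchain: `tile_workload` and `fan_out` are hypothetical names, and each worker here just squares a number, standing in for whatever independent workload a tile would run.

```python
import concurrent.futures
import os

def tile_workload(tile_id):
    # Stand-in for an independent workload; on a real Tile64 part each
    # of these could be pinned to its own tile, or even run its own OS image.
    return tile_id * tile_id

def fan_out():
    # One worker per available core; on a Tile64-class chip this would be 64.
    cores = os.cpu_count() or 1
    with concurrent.futures.ThreadPoolExecutor(max_workers=cores) as pool:
        # map() preserves input order, so results line up with tile IDs.
        return list(pool.map(tile_workload, range(cores)))
```

The point of the sketch is simply that management scales with the number of independent workloads, not the number of chips.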
Fairly recently, I believe they released a server with Quanta called the S2Q. The potential there is for 10,000 cores to be packed into a standard server rack. That kind of server density is just unbelievable. The other big benefit is that these processors don’t consume much power. So you’re going to get a lot of power, a lot of high core density within the machine, and you’re not going to end up drawing a lot of juice from your server rack.
There are a lot of beneficial things to take away from the Tilera, or even the multi-core Intel Atoms showing up on servers. Let’s take a look at programmers. These Tile64 procs are programmable in ANSI C or C++, which gives developers unbelievable leverage to get the most out of current software and protect their investments moving forward. I’m looking forward to seeing a lot more manufacturers adopt these processors and really start pushing them out into the server environment.
Bigelow: How would you manage a server with that concentration of computing power? Would it have any effect on your management tools or your management tactics? Is it really just a matter of fewer workloads utilizing an enormous amount of processing power or, as you alluded to earlier, is it really a matter of managing a great deal more workloads given the number of available processors?
Kleyman: That’s a really good question. You’re talking about a single processor with 64 cores embedded on it. What if you’re talking about multiple processors? What if you have half a dozen of these processors sitting on a single server? Now you’re talking about hundreds of cores, and some of those cores are running their own operating system. Or you can use multiple tiles or multiple cores for a multiprocessing OS. How do you start managing these? It really all [comes down to] the application of the server.
For example, if you’re running an OS per core, you still have options like IPMI 2.0 on the server. The management tools are certainly still there. If you’re putting this kind of processor or server into a virtual environment, you’re looking to work with already-embedded tools that are very granular in themselves. VMware, for example, or Citrix’s XenServer can look at processor capacity, processor speed and how they’re being utilized by the VM. Even from the GUI of your virtual application or your hypervisor application, you’re able to throttle memory and perform dynamic memory balancing, as well as control how many processors you need to allocate to a specific VM.
So the real question is: How are these servers going to be used? Are they going to be used in a virtual environment, or standalone in a development environment? That’s really going to help you gauge how you need to control this. For the most part, there are a lot of tools currently available. Like I mentioned earlier, you can use IPMI 2.0, as well as embedded tools, to help manage the power and consumption of these processors.
Bigelow: The next area we’re really interested in is memory. Obviously, servers rely a lot on memory; they provide a tremendous amount of memory, but we’re seeing flash memory appearing on large servers to support critical, memory-intensive workloads. We see some examples of this in the IBM Smart Analytics System 5600 and the IBM WebSphere DataPower XC10. Do you see a need for enhanced local memory on a server – for example, flash memory as opposed to a hard drive – to boost data accessibility and take better advantage of the processor? Or is that type of memory use still a computing niche for a few select companies?
Kleyman: The picture is getting much clearer in terms of understanding how flash memory plays a role in the server environment. Personally, I feel that using flash memory on a server is a great idea, and we’re definitely going to see a lot of it. To get into a little more detail, let’s take a look at both of the IBM systems you mentioned, the Smart Analytics System 5600 and the WebSphere DataPower XC10. These machines are designed for absolute maximum performance – and they have a price tag to match – and any degradation could be detrimental to the workload. So essentially, what IBM did with onboard flash memory is offload some of that work from its hard disks onto the onboard flash memory. Beyond that, IBM is using this flash memory to offload disruptive writes to temporary locations – for example, when a sort, a SQL database or something of the sort is doing a high-end complex calculation – to further enhance the performance of complex queries that are done on the server. Basically, you’re using flash memory products to ensure that queries are processed quickly and overall analytics response times shrink for the server.
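The fast-tier-in-front-of-slow-disk idea described here can be sketched with Python’s `functools.lru_cache`. This is an analogy, not IBM’s implementation: `disk_query` and `cached_query` are hypothetical names, and ordinary RAM plays the role of the onboard flash.

```python
from functools import lru_cache

call_log = []  # tracks how often the slow tier is actually touched

def disk_query(key):
    # Hypothetical stand-in for a complex query that hits spinning disk.
    call_log.append(key)
    return key * key

@lru_cache(maxsize=1024)  # fast tier, playing the role of onboard flash
def cached_query(key):
    return disk_query(key)
```

The second request for the same key is served from the fast tier, and the slow tier is never touched again – which is exactly why repeated analytics queries get faster.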
Now, you mentioned the XC10, another great example. Here is a device that is used very heavily for enterprise applications. It can be deployed as a standalone appliance or as a group of appliances, and usage can range from simple data grids for WebSphere applications all the way to data grids serving general applications within the environment. Keeping data in cache or flash memory, as opposed to on disk, avoids unnecessary disk storage and retrieval of data while it’s active within the set of enterprise applications. So, while you’re accessing the data, or certain reads and writes are being done, some of that information can be stored in flash memory so that these enterprise applications don’t see a degradation in performance. I think that’s a great idea.
Bigelow: How do you see the use of this type of flash memory affecting server virtualization levels? I can see how this accelerates a workload’s performance, but does this kind of approach really facilitate greater levels of server consolidation, for example?
Kleyman: That is a little bit tougher to answer. Whether this achieves greater levels of consolidation is probably a conversation or, if you ask some IT people, an argument. The point is, if you’re using flash memory, you are removing spinning disks, and that’s a point of failure that is all of a sudden removed with solid-state technology. But the technology isn’t quite far enough along to be enterprise ready. When we start making fully enterprise-ready [solid-state drive] SSD [storage area networks] SANs for everyday data center deployment, that is when we can probably start talking about more virtualization benefits.
Because the rate of failure is lower than on a spinning disk, administrators are probably going to be braver about putting higher amounts of workloads on an SSD. And from a consolidation perspective, if you have more confidence in your drive, you can put more workloads on it and space them out a bit more evenly on the server. So yes, you can definitely consolidate with that. I’m pretty confident in saying that over the next few years, billions of enterprise dollars are going to be shifted from rotating disks to flash memory solutions.
However, there is a really important note of caution here – just because there are no spinning disks does not mean flash memory is flawless. There are absolutely concerns about SSD durability. Obviously these are being addressed by the industry, but there is one truth that needs to come out, and that is that SSDs, solid-state technology and flash memory have a finite number of write cycles. So whenever you’re planning a long-term system deployment, you have to consider whether your environment is going to generate a huge number of write cycles, and if it is, you need to make sure you plan for that. That is very, very important.
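A rough way to plan for that finite write budget is simple arithmetic: total writable volume divided by daily write volume. The figures below – capacity, program/erase cycles, write amplification – are illustrative assumptions, not any vendor’s specs, and `ssd_lifetime_years` is a made-up helper.

```python
def ssd_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb, write_amp=1.5):
    # Total gigabytes the drive can absorb over its rated P/E cycles,
    # discounted by write amplification, divided by the daily write volume.
    total_writable_gb = capacity_gb * pe_cycles / write_amp
    return total_writable_gb / daily_writes_gb / 365

# e.g. a 200 GB drive rated for 3,000 P/E cycles, absorbing 50 GB of
# writes a day, lasts roughly two decades by this crude estimate
years = ssd_lifetime_years(200, 3000, 50)
```

The useful takeaway is the shape of the formula: a write-heavy environment shrinks the lifetime linearly, which is exactly why update-intensive deployments need to be planned for up front.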
Bigelow: When we talk about server technology innovations, I think we really need to say something about server power management, as well. I’ve heard a lot of interest in migrating workloads and powering down underutilized servers during off hours. How realistic are tactics like these today, and what are some lingering problems with those types of tactics?
Kleyman: Data center [managers] are seeing higher power utilization, and they’re trying to [mitigate] that through virtualization, better processor technology, better hard drives and greener technology. But this is still a little bit of a grey area. The question is: How can we manage power? Can we throttle it in some way? Can we shut down computers if they’re not being used? In reality, it is still a grey area, but it is an important one.
There is a great software tool that I’ve had a chance to toy around with from a company called 1E. It’s called NightWatchman Server Edition, and it really helps manage power from a granular perspective. Let’s say you’re a data center admin with numerous racks housing dozens, if not more, physical servers. Managing the power of those racks is probably in your top 10 list of important things to do. Software like NightWatchman Server Edition can designate important and not-important workloads. From there, you are able to specify low-usage times and how the software will reduce power consumption by asking the processor itself to run at a lower speed. The other thing you are able to do is manage virtual sprawl – it’s kind of like server sprawl, but with VMs popping up everywhere. You are also able to select processes and I/O where power savings are more important than performance.
In all honesty, it makes me nervous to think about an automated software tool that may power down my servers because it feels they are being power hogs. I can even provide an example. Let’s say you’re in an environment, there is an emergency and a new server gets spun up, and for some reason that new server wasn’t designated high or low power. Some software tool sees this new server taking up an immense amount of power and decides, ‘Hey, let’s put it in a low-power state, or let’s even power it down, because we don’t know what it is doing or why it is up.’ Then, all of a sudden, you have a down emergency server.
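The failure mode described here suggests a simple safeguard: default any unclassified server to full power. This is a hypothetical policy sketch, not 1E’s actual API – `POLICIES` and `power_policy` are made-up names.

```python
# Hypothetical role-to-policy map; anything not listed is treated as critical.
POLICIES = {
    "critical": "full",
    "batch": "low",
    "dev": "low",
}

def power_policy(role):
    # Fail safe: an unknown role (e.g. an emergency server spun up in a
    # hurry) gets full power rather than being throttled or shut down.
    return POLICIES.get(role, "full")
```

With a fail-safe default like this, the emergency server in the example above would simply run at full power until someone classified it, instead of being powered down by mistake.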
There are ways of managing power without having to power down a machine. With software tools, like those from 1E, you are able to place these machines in a low-power state while still allowing them to run necessary operations. If you are able to do that and actually lower the cycles that a processor is taking during a low-power state, you are averaging a savings of 12% or more in reduced energy.
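That 12% figure is easy to turn into a ballpark estimate. In this hedged sketch, all the inputs – per-server wattage, server count, low-usage hours per day – are made-up example values, and `annual_savings_kwh` is a hypothetical helper.

```python
def annual_savings_kwh(server_watts, server_count, low_hours_per_day,
                       reduction=0.12):
    # Energy saved per year: power drawn during low-usage hours across
    # all servers, scaled by the claimed 12% reduction, converted to kWh.
    return (server_watts * server_count * low_hours_per_day
            * 365 * reduction / 1000)

# e.g. twenty 400 W servers spending 12 hours a day in a low-power state
savings = annual_savings_kwh(400, 20, 12)
```

Even with these modest example numbers, the savings land in the thousands of kilowatt-hours per year, which is why per-rack power management makes the top 10 list.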
I would be wary of any product that would turn a machine on or off on its own, and I’d probably be even more scared if it made it operationally ineffective. Having a server in a low-power state, while still being able to run processes, is definitely a big plus. But powering down a machine, I don’t think I’d have that rolled out in my environment.
Bigelow: What other server technology advancements do you see coming down the line and affecting your data center in the next, say, six to 12 months?
Kleyman: That’s a good question. There is a law, it’s called Moore’s Law, which states that the number of transistors that can be placed inexpensively – that’s the key word – on an integrated circuit doubles every two years. This trend is definitely applicable to things like processor speed, memory capacity, sensor technologies and so on. With that, I am honestly looking forward to three really crucial, big steps in the next six to 12 months: CPUs; memory utilization, including how it is distributed, as well as size and capacity; and storage technologies, which we talked about earlier in the form of flash memory and SSDs. That’s going to play a huge part in my data center. With better processors, more memory and even better storage, I’m able to do so much more with a bigger bang for my buck. One of the biggest effects I’m going to see is in my DR environment, with the capability to host maybe two or three physical servers in a remote location in a literal hot site, with processors and memory that are deemed less expensive but still very powerful. I can replicate my entire environment over a WAN and not have to blow my budget for the entire year. CPU, memory and storage technologies are going to come a long way in the next six to 12 months, helping IT admins do a lot more with their environments.
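The doubling Kleyman cites can be written down directly. A minimal sketch – `transistor_estimate` is a made-up helper, and the two-year doubling period is the classic figure from the interview, not a measured constant:

```python
def transistor_estimate(initial_count, years, doubling_period=2):
    # Transistor count after `years`, doubling every `doubling_period` years.
    return initial_count * 2 ** (years / doubling_period)

# e.g. a part with 1 million transistors today projects to 4 million in 4 years
projected = transistor_estimate(1_000_000, 4)
```

The same exponential shape is what makes the DR scenario above plausible: hardware that is budget-class in a year or two matches what was premium-class when the environment was first built.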
This was first published in April 2011