Capacity planning is one of those IT buzzwords that appears very simple to define but is not easy to understand. Often, this simple phrase can strike fear into the hearts of even the most competent system administrators. The bottom line is that capacity planning is too often overlooked by systems administrators until they need to justify additional hardware resources. This is absolutely the wrong time to start thinking about it, because you are paid to be proactive, not to put out fires that you may have created.
Effective capacity planning is a proactive way of ensuring that your systems will not prematurely run out of space or horsepower. Here I aim to give you a better understanding of capacity planning, while also exploring the tools that can assist you in your capacity planning efforts. We're going to define capacity planning and look at tools you can use today to generate historical data that will allow you to estimate future growth. We're also going to discuss tools available to help you size new systems, another part of capacity planning. Finally, we'll touch on some application-type capacity planning tools that will allow you to look at data from an application or database perspective, rather than from the OS. Where possible, drilling down in this fashion is actually the best way to estimate future growth.
What is data center capacity planning?
Let's first define the term capacity planning. While the definition itself can vary depending on whom you are speaking with, I like to define it as the process of estimating the future capacity and growth of your system. The capacity in question can be horsepower, as it relates to your CPU and RAM, or storage, if we're looking at disk space and disk I/O. It's all about being able to meet the service level agreements (SLAs) that you've already committed to.
Here's a quick example: The reports that you have been running on your Unix or Linux systems to track workload and monitor performance show workload utilization increasing 25% per year and performance levels decreasing 5% per quarter. Based on this analysis, you should quickly be able to estimate what it would take to keep your systems humming at acceptable service levels. At the same time, if your storage has averaged a 40% per year increase over the past three years, you should also be able to approximate the amount of storage required to ensure that you won't be running out of space (and don't forget to budget the dollars for it). But I'm sure that, as a systems administrator, you've never run out of space, right?
Let's also make a distinction between capacity planning and performance analysis. A capacity planner looks at ways to handle the growing demands of the business, while the performance analyst monitors and tunes the servers and looks for ways to prevent bottlenecks. Oftentimes, the system administrator has both of these roles and more.
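To make the arithmetic in the quick example concrete, here is a minimal sketch of a compound-growth projection. The 40% annual growth rate comes from the example above; the starting usage of 10 TB and the installed capacity of 30 TB are hypothetical numbers chosen only for illustration, and the function names are my own.

```python
import math


def project_capacity(current, annual_growth, years):
    """Return projected demand after `years` of compound annual growth."""
    return current * (1 + annual_growth) ** years


def years_until_full(current, capacity, annual_growth):
    """Return how many years until compound growth reaches `capacity`."""
    if current >= capacity:
        return 0.0
    return math.log(capacity / current) / math.log(1 + annual_growth)


# Hypothetical starting point: 10 TB in use on a 30 TB array.
used_tb = 10.0
installed_tb = 30.0
growth = 0.40  # 40% per year, as in the storage example

for year in range(1, 4):
    projected = project_capacity(used_tb, growth, year)
    print(f"year {year}: {projected:.1f} TB")

remaining = years_until_full(used_tb, installed_tb, growth)
print(f"array full in roughly {remaining:.1f} years")
```

The same two functions work for the 25% per year workload figure; only the inputs change. A straight compound-growth model is an assumption in its own right, so treat the result as a first estimate to be checked against what the business tells you.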
Generating historical capacity data on your servers
What are some of the tools you could be using to generate historical data on your systems? The bottom line is that you should have some way of measuring the data, reporting on it and analyzing it in a spreadsheet. My preference has always been to start with tools that have been created or optimized for your particular flavor of Unix or Linux. For example, if you are running IBM's AIX (or Linux on IBM's Power platform), I would use a combination of nmon and the nmon analyzer, which together allow you to track historical performance data. The nmon analyzer provides the mechanism to bring data into easy-to-understand, custom-developed Excel spreadsheets. nmon also integrates well with Ganglia, an open source monitoring system (which can also be used outside of IBM's Power AIX or Linux systems) that displays graphical data on a website, including configuration and performance statistics. AIX also has a performance tool called LPARmon, which allows you to monitor a System p logical partition from a Windows or Linux machine. You can also save the data to files for further trending and analysis.
If you're using Sun's Solaris, I would recommend a combination of /proc, corestat, sysperfstat and nicrec. Corestat is a tool used to monitor core utilization. It's a Perl script that forks the cpustat command at run time and then aggregates the instruction count to derive the core utilization. Originally developed for the Sun Fire T1000/T2000 and T5120/T5220 servers, it is also available for SPARC64 processors. Sysperfstat prints utilization for CPU, RAM, disk I/O and network. You should also use the SE Toolkit, a collection of scripts for performance analysis that provides advice on performance improvement. The product itself has been the standard in system performance monitoring for Solaris over the past decade.
With HP's Unix (HP-UX), you can use top to display and update system information. More comprehensive tools such as GlancePlus and MeasureWare/Perfview are also available for your perusal. GlancePlus is a performance diagnostic tool. In the MeasureWare/Perfview utility, MeasureWare collects performance data while PerfView displays the results.
For Linux I like using the sysstat utilities, a collection of performance monitoring tools that includes sar, sadf, mpstat, iostat, pidstat and the related sa tools. I've always had a preference here for mpstat. Note that some Linux distributions also offer their own tools for performance management and capacity planning, many of which are packaged as RPMs and are also available on other distributions. Red Hat Enterprise Linux 5 offers the following tools out of the box: SystemTap, Frysk and OProfile.
What about generic tools that are available on all Unix flavors and Linux distributions? Is there such a thing? Absolutely! If you've decided to go this route for reasons of consistency and cost, I would use the system activity reporter (sar), quite possibly the oldest Unix monitoring utility there is. I really like sar, and with its more powerful colleague sadf, you can do reporting from the data that sar collects. You can just leave it running in cron. Sar is available on every version of Unix or Linux, though you may have to download the sysstat utilities for your Linux version if you don't see it installed.
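As a rough illustration of turning sar data into something you can trend, the sketch below averages CPU busy time per day from semicolon-separated text of the kind `sadf -d` produces. The sample rows, the host name and the exact column layout are assumptions made up for this example; the columns vary between sysstat versions, so the code looks fields up by header name rather than by position.

```python
import csv
import io
from collections import defaultdict

# Illustrative sample only. Real input would come from something like
# `sadf -d` run against a sar data file, and its columns may differ.
SAMPLE = """\
# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
web01;600;2024-01-15 00:10:01 UTC;-1;12.00;0.00;3.00;1.00;0.00;84.00
web01;600;2024-01-15 00:20:01 UTC;-1;20.00;0.00;5.00;1.00;0.00;74.00
web01;600;2024-01-16 00:10:01 UTC;-1;30.00;0.00;8.00;2.00;0.00;60.00
"""


def daily_cpu_busy(text):
    """Return {date: mean percent busy}, where busy = 100 - %idle."""
    lines = text.splitlines()
    # The first line is the header, prefixed with "# ".
    header = lines[0].lstrip("# ").split(";")
    # Skip blank lines and any repeated header lines in the data.
    data = [ln for ln in lines[1:] if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(data)),
                            fieldnames=header, delimiter=";")
    samples = defaultdict(list)
    for row in reader:
        day = row["timestamp"].split()[0]  # keep only the date part
        samples[day].append(100.0 - float(row["%idle"]))
    return {day: sum(vals) / len(vals) for day, vals in samples.items()}


for day, busy in sorted(daily_cpu_busy(SAMPLE).items()):
    print(f"{day}: {busy:.1f}% busy")
```

Daily averages like these are the kind of historical series you would paste into a spreadsheet or feed into a growth estimate. In practice you would probably also want daily peaks, since averages can hide the busy hours that actually drive sizing.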
I would be remiss if I didn't mention some high-end systems such as BMC Performance Management, made specifically for the Fortune 500 type of client that requires fancy reports and more transparency for estimating growth. The product comes with an add-on (which is far from free) called Perform & Predict. This BMC product looks at individual processes and threads, and can break out the data you need with its workload profiling. Other commercial tools include those available from TeamQuest and CA Unicenter. These high-end packages are not cheap and, depending upon the complexity of your environment, may require dedicated administrators to run them. One plus is that these tools are available for all flavors of Unix and Linux. Another tool worth a mention is SarCheck, a lower-priced commercial tool developed and sold by Aptitune Corporation, which helps system administrators with Linux and Unix performance tuning. It does this by analyzing the output of sar, the /proc filesystem, ps and other files, while also reading more information from the kernel. It identifies problem areas and, if necessary, recommends changes to the system's tunable kernel parameters and hardware resources (i.e., CPUs, disks and memory).
New data center system sizing guidelines
Another side of capacity planning relates to systems sizing. When you are building a system, you don't usually have the luxury of referencing historic data.
In this case, what do you do? How do you size new servers when you can't use the capacity planning tools discussed above because there is no historical data? This is where using some of the vendor tools can help you. IBM has several tools to help with sizing and capacity planning.
One tool is the IBM Systems Workload Estimator (WLE), which helps you size your server for anticipated need. It provides several scenarios to find the best sizing solution, including recommendations for CPU, RAM and I/O. The tool is also fully integrated with IBM's System Planning Tool (SPT), a capacity architecture tool that allows users to architect Power servers (running either Unix or Linux partitions) from a Windows-based computer. These tools enable you to plan a system based on existing performance data or on new workloads. You can create an entirely new system configuration, or create a system configuration based on any of the following:
- Performance data from existing systems that the new system will replace
- Performance estimates that anticipate future workloads that you will need to support
- Sample systems that you can customize to fit your specific needs
Another tool provided by IBM to its internal consultants and business partners is the IBM System p consolidation guide. This uses benchmarks from IDEAS International and the Standard Performance Evaluation Corporation (SPEC), which help you determine the amount of horsepower you will need to support given workloads. What I especially like about this software is that it allows you to plug in values for a different vendor's hardware system, and the spreadsheet calculates how many CPUs you would need on a different piece of hardware. I've used this tool successfully in Solaris to AIX migrations and have found it to be very accurate. The tool actually saved me hundreds of thousands of dollars, as I was able to size my systems using my own calculations rather than relying on a hardware vendor who wants to sell the biggest box they can.
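The general idea behind this kind of benchmark-based sizing can be sketched in a few lines. This is not the formula used by the IBM consolidation guide, which I am not reproducing here; it is a simplified illustration that assumes you have a comparable per-core benchmark score for both machines. All of the numbers, the function name and the 70% target utilization are hypothetical.

```python
import math


def cores_needed(src_cores, src_score_per_core, src_peak_util,
                 dst_score_per_core, target_util=0.70):
    """Estimate target cores from relative per-core benchmark scores.

    Work consumed on the source = cores * per-core score * peak
    utilization. The target must deliver that much work while staying
    at or below `target_util`, which leaves headroom for growth.
    """
    consumed = src_cores * src_score_per_core * src_peak_util
    usable_per_core = dst_score_per_core * target_util
    return math.ceil(consumed / usable_per_core)


# Hypothetical numbers: a 16-core source server with a per-core score
# of 10 that peaks at 60% busy, moving to hardware scoring 25 per core.
print(cores_needed(16, 10.0, 0.60, 25.0))
```

Rounding up and sizing to a target utilization below 100% are deliberate choices here: the first avoids under-buying by a fraction of a core, the second keeps room for the growth you have been trending. Published benchmark scores only approximate your own workload, so a result like this is a starting point for discussion with the vendor, not a final answer.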
On the application side, there are several tools that will help you size the type of system that you will need. HP LoadRunner is probably the most well-known. It simulates user workloads, allowing the architect or capacity planner to test various back-end configurations to determine the right sizing. Open source equivalents include JMeter and The Grinder. Oracle's Hyperion is another tool worth looking at.
It is important to mention that when sizing boxes for future growth, you must fully engage the business and the application owners to get a better understanding of the client's plans. For example, while your statistics may show a 25% increase in utilization, you may not know that the banking division is going to be sold, which means that when purchasing your next systems, you should not go with the statistic alone. You need to know if the client will be rightsizing, downsizing or perhaps bringing in new business, which may mean that you need 400% more iron than you think.
Perhaps the most compelling reason for doing capacity planning is that, if done effectively, it can help you avoid costly hardware upgrades through your trending and performance data. Another reason is that, if done properly, it can save your organization big money and make you the superstar you know you are.
ABOUT THE AUTHOR: Ken Milberg is a systems consultant with two decades of experience working with Unix and Linux systems. He is a SearchEnterpriseLinux.com Ask the Experts advisor and columnist.