What are some common misconceptions about the relationship between capacity and throughput?
Steve Shah: First and foremost, the assumption is that they are the same. They are not. Capacity measures the amount of something you can simultaneously process without falling over. This could be processes, connections, database requests. Throughput measures the amount of something you can process per unit time. For example, new processes/sec, connections/sec. While the two are related, their relationship is not necessarily fixed, as there could be other activity that impacts one or the other. For example, a cron job could be consuming a lot of memory that reduces the maximum number of concurrent connections the Web server can process. But the cron job is not heavily impacting CPU, so throughput of the Web server may remain the same.
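One standard way to see why the two numbers differ (an illustration added here, not something Shah cites) is Little's Law: the average number of requests in flight equals throughput multiplied by the average time each request spends in the system. The sketch below uses made-up figures and hypothetical function names to show that the same capacity can correspond to very different throughputs.

```python
# Little's Law: average concurrency = throughput * average time in system.
# All figures are invented for illustration; they are not measurements.

def concurrent_requests(throughput_per_sec: float, avg_latency_sec: float) -> float:
    """Average number of requests in flight at a given throughput and latency."""
    return throughput_per_sec * avg_latency_sec


def max_throughput(capacity: int, avg_latency_sec: float) -> float:
    """Highest steady throughput before in-flight requests exceed capacity."""
    return capacity / avg_latency_sec


if __name__ == "__main__":
    # Same capacity (1,000 concurrent connections), two different latencies.
    print(max_throughput(1000, 0.5))
    print(max_throughput(1000, 2.0))
    # Requests in flight when serving 200 requests/sec at 0.25 sec each.
    print(concurrent_requests(200, 0.25))
```

If something unrelated, such as the cron job in Shah's example, reduces the capacity figure without changing latency, the throughput ceiling drops only when the server is actually running near that ceiling.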
Of course, systems vary, but, in general, why doesn't the highest capacity produce the highest throughput?
Shah: This is usually because the efficiency of the algorithms driving the system isn't constant across the board. Many of us using Linux may recall from pre-O(1) scheduler days that the efficiency of switching between processes decreased as the number of processes grew. Eventually, the system may have been able to work with 300 processes, but each process wasn't doing much.
For most applications, it is prudent to optimize server usage for peak throughput. That is, you want to turn over as many requests on the server per second as possible so that you can get new requests in. This means that the server may start processing a new request a little after it arrives rather than immediately. However, the request will take less time to complete, and thus the total time (from arrival to completion) will be less than if the server had started processing the request immediately.
It is important to keep these numbers in mind when doing capacity planning, since you'll want to invest in new servers, software, time, space and power before load reaches the advertised peak capacity.
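The trade-off between starting every request immediately and making some wait can be sketched with a toy model. Everything here is an assumption made for illustration: a single CPU, requests that all arrive at once and each need one unit of work, and an invented `overhead` term that makes switching between in-flight requests more expensive as their number grows.

```python
def aggregate_rate(in_flight: int, overhead: float) -> float:
    """Useful work per second when `in_flight` requests share one CPU.

    Assumed toy model: each extra concurrent request adds switching overhead.
    """
    return 1.0 / (1.0 + overhead * (in_flight - 1))


def mean_completion_time(total: int, limit: int, overhead: float) -> float:
    """Mean arrival-to-completion time for `total` requests arriving at t=0.

    Each request needs 1 unit of work. At most `limit` requests run at once;
    the rest wait their turn.
    """
    clock = 0.0
    finished_sum = 0.0
    remaining = total
    while remaining > 0:
        batch = min(limit, remaining)
        # Requests in a batch share the CPU equally and finish together.
        clock += batch / aggregate_rate(batch, overhead)
        finished_sum += batch * clock
        remaining -= batch
    return finished_sum / total


if __name__ == "__main__":
    for limit in (100, 10, 1):
        print(limit, round(mean_completion_time(100, limit, 0.01), 2))
```

In this model, admitting fewer requests at a time lowers the mean arrival-to-completion time even though most requests wait before they start. Real servers have more than one CPU and spend time blocked on I/O, so the best limit in practice is rarely 1 and has to be measured.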
What are the common barriers to getting peak throughput on Linux?
Shah: Every application has a different barrier. Simple Web-based applications, for example, tend to hit a barrier at the maximum number of concurrent connections. There comes a point when the penalty of connection table lookups starts eating up a lot more time on a per-packet basis. Another common barrier is memory, which is, thankfully, solved quite easily.
More often than not, performance problems are a complex mix of activities -- application performance, poorly written software, bogged down hardware, databases. In one situation I recently dealt with, the problem turned out to be rooted in the application requiring 140 calls to the database for a particular operation. This soaked up a lot of resources on the database, which impacted the other application servers that were also using it. During those queries, the application server also became low on memory, but was largely sitting idle. [This] made it turn over a lot of existing connections quickly and thus the load balancer gave it more traffic to contend with, which only aggravated the problem.
What other types of mistakes do IT managers make when analyzing capacity?
Shah: The second most common mistake is the assumption that performance is linear. In most cases, performance is not linear, but rather it follows a polynomial.
Finally, and this is a real pet peeve of mine, the assumption that more threads means faster. No! No! No! Threading is very application specific -- some applications do very well with more threads and are only limited by the operating system's ability to handle a lot of them. Other applications have dependencies between the threads, which means that locking can quickly make threads go idle and more threads means just more idle threads. Some applications don't even need threads -- they follow a run-to-completion model that keeps them optimally busy.
Apache is a frequent victim of too many threads; it follows a performance curve just as any application does. And depending on what you're doing with Apache, you may find that only a handful of threads makes sense.
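A rough way to see why more threads do not automatically mean faster is Amdahl's law, which Shah does not name but which fits his point about locking. In the sketch below, `serial_fraction` is an assumed stand-in for the share of each request spent holding a shared lock, and the value 0.2 is invented for illustration.

```python
def speedup(threads: int, serial_fraction: float) -> float:
    """Amdahl's law: best-case speedup with `threads` workers when
    `serial_fraction` of the work cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Assumed: 20% of each request is serialized behind a lock.
    for threads in (1, 2, 8, 64, 1000):
        print(threads, round(speedup(threads, 0.2), 2))
```

With 20 percent of the work serialized, the speedup can never exceed 5 however many threads are added. This is an optimistic bound, since it ignores the scheduling and memory cost of the extra threads themselves.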
How would those mistakes mess up capacity planning?
Shah: Administrators could find themselves under-planning. If they think that their server can handle 1,000 users, when in reality it doesn't make sense to load it with more than around 100, they could find themselves with an underperforming application.
What makes capacity planning different, easier or harder on Linux than other operating systems?
Shah: Linux capacity planning is different from capacity planning on other operating systems simply because it [has] a different set of performance behaviors. Administrators should evaluate [their] particular application under various workloads to determine the sweet spot.
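Finding that sweet spot can be as simple as recording throughput at several load levels and picking the peak. The sketch below assumes you already have such measurements from your own load tests; the numbers and the `sweet_spot` helper are hypothetical.

```python
from typing import Mapping


def sweet_spot(measurements: Mapping[int, float]) -> int:
    """Return the load level (concurrent users) with the highest throughput."""
    if not measurements:
        raise ValueError("need at least one measurement")
    return max(measurements, key=lambda load: measurements[load])


if __name__ == "__main__":
    # Invented data: concurrent users -> requests completed per second.
    observed = {50: 480.0, 100: 910.0, 200: 1240.0, 400: 1100.0, 800: 620.0}
    print(sweet_spot(observed))
```

In this made-up data set, throughput climbs up to 200 users and falls beyond that, so loading the server with 800 users would yield about half the peak rate.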
What's the ideal scenario for achieving peak capacity and peak throughput on Linux?
Shah: Linux makes great use of extra memory. For example, the kernel maintains a buffer cache that, when hit often, allows the system to perform well. If your application calls on the disk often, it may be prudent to try to optimize it with enough memory and find a way to distribute workloads so that each server tends to do the same thing as much as possible.
On the networking side, the kernel does remarkably well, but not all applications do. Scale carefully here.
Definitely make use of the new features for performance on the newer kernels -- like the O(1) scheduler -- to give a little edge to your performance.