Growing a mainframe is different from scaling other back-end IT platforms. With distributed compute, if capacity planners find some servers reaching their limit, they add a few more to the racks and go drink coffee.
Mainframe capacity planning, however, is more careful and disciplined because of the money involved. Turning on another processor, or even increasing CPU caps on a logical partition (LPAR), can have huge consequences in hardware and software costs on the mainframe.
IBM mainframes have tools such as System Management Facilities (SMF) and Resource Measurement Facility (RMF) that account for nearly every cycle consumed during mainframe usage. IBM also eases the upgrade path of big iron by shipping central processor complexes (CPCs) with full complements of processors, with microcode controlling which ones the IT organization can use. This configuration allows authorized IT shops to execute a Customer Initiated Upgrade with just a microcode update -- and a check to IBM. Capacity on Demand is available for permanent upgrades, and On/Off Capacity on Demand (OOCoD) suits temporary ones. IT organizations that pursue dynamic mainframe capacity planning must sign the appropriate contract with IBM and have capacity upgrade records installed.
The next step in z/OS capacity planning
Capacity Provisioning Manager (CPM), available in z/OS 1.9 and later, takes the dynamic tuning idea a step further by allowing for automatic addition or deletion of mainframe capacity depending on workload performance. With CPM, installations can define a policy by which the provisioning manager might invoke OOCoD if a workload isn't performing well. CPM can remove the capacity when demand returns to normal.
CPM makes these decisions through its interface with the z/OS Workload Manager (WLM). Through WLM, CPM monitors workloads and their success in meeting performance goals. If a workload is missing its mark, however, CPM doesn't look only at CPU. It considers all the factors that could delay a workload and might not invoke OOCoD if it determines that extra capacity won't help.
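The decision described above can be illustrated with a minimal Python sketch. This is not CPM's actual algorithm or interface: the `WorkloadSample` fields, the `would_capacity_help` function and both default thresholds are hypothetical names and values chosen for illustration. The only borrowed idea is WLM's convention that a performance index above 1.0 means a workload is missing its goal.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSample:
    """Hypothetical snapshot of one workload's state (not a real WLM record)."""
    performance_index: float  # above 1.0 means the goal is being missed
    cpu_delay_pct: float      # share of observed delay spent waiting for CPU
    io_delay_pct: float       # share of observed delay spent waiting for I/O
    other_delay_pct: float    # everything else (paging, enqueues and so on)


def would_capacity_help(sample, pi_limit=1.5, min_cpu_delay_pct=50.0):
    """Return True only if the workload is suffering AND CPU is the main cause.

    Both thresholds are illustrative assumptions, not product defaults.
    """
    if sample.performance_index <= pi_limit:
        return False  # the workload is close enough to its goal
    # Extra engines cannot fix an I/O or paging bottleneck.
    return sample.cpu_delay_pct >= min_cpu_delay_pct


cpu_bound = WorkloadSample(2.0, 70.0, 20.0, 10.0)
io_bound = WorkloadSample(2.0, 10.0, 80.0, 10.0)
print(would_capacity_help(cpu_bound))  # True
print(would_capacity_help(io_bound))   # False
```

The point of the second check is the one the article makes: a struggling workload alone is not enough to justify paying for temporary capacity.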
CPM's policy rules have a lot of granularity. Users can define rules for different workloads at different times of the day with thresholds for when to turn on extra capacity, along with capacity minimums and maximums. Perhaps even more importantly, some rules control when capacity should be taken away after the rush is over.
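As a rough illustration of how such a rule might behave, here is a small Python sketch. It does not reproduce CPM's real policy format; the `Rule` fields, the `next_extra_msu` function and all the numbers are assumptions made for this example. It models one workload, one time window, separate thresholds for adding and removing capacity, and a floor and ceiling on the temporary capacity.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Rule:
    """Hypothetical provisioning rule; field names are illustrative only."""
    workload: str
    start: time            # window in which the rule is active
    end: time
    provision_pi: float    # add capacity at or above this performance index
    deprovision_pi: float  # remove capacity at or below this performance index
    min_extra_msu: int     # floor for temporary capacity while active
    max_extra_msu: int     # ceiling for temporary capacity
    step_msu: int          # how much to add or remove per decision


def next_extra_msu(rule, now, pi, current_extra):
    """Return the temporary capacity the rule asks for at this moment."""
    if not (rule.start <= now < rule.end):
        return 0  # outside the window, release all temporary capacity
    if pi >= rule.provision_pi:
        return min(current_extra + rule.step_msu, rule.max_extra_msu)
    if pi <= rule.deprovision_pi:
        return max(current_extra - rule.step_msu, rule.min_extra_msu)
    return current_extra  # between thresholds: leave capacity alone


rule = Rule("ONLINE", time(8, 0), time(18, 0), 1.8, 1.2, 0, 100, 25)
print(next_extra_msu(rule, time(9, 0), 2.0, 0))    # 25
print(next_extra_msu(rule, time(9, 0), 1.0, 25))   # 0
print(next_extra_msu(rule, time(20, 0), 2.0, 50))  # 0
```

Keeping the removal threshold below the addition threshold leaves a dead band between them, so capacity is not switched on and off repeatedly when performance hovers near a single limit.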
For general-purpose processors, CPM manages capacity in millions of service units per hour (MSUs). Exactly how it does this depends on the mainframe users' setup and implementation of soft or hard caps. For specialty processor capacity, CPM varies zIIPs and zAAPs online or offline rather than adjusting MSUs.
CPM extends to hardware
The release of z/OS 2.2 brings Capacity Provisioning to the hardware level by adding the capability to monitor processor utilization across a CPC by processor type. When CPU utilization exceeds a defined threshold, CPM brings more engines online for the LPARs that need them. When the workload crisis is over, CPM takes the engines away.
For this engine-assist functionality, CPM policies group CPC logical names, as defined in the box's support element, into provisioning domains. Within the policy the user defines utilization conditions, which set thresholds and times for adding or subtracting capacity. Utilization conditions cannot be used for managing defined and group capacity. Nor do they take into account bottlenecks other than CPU usage.
Dynamic capacity at the CPC level does not involve WLM or workloads' ability to meet their goals. Because CPM considers only processor utilization in this mode, unimportant workloads, such as batch, may drive CPC utilization high enough to trigger an increase even if that's not what the user wanted. That is the tradeoff: IBM designed this scheme to respond to capacity spikes much more aggressively than past mainframe technologies did.
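A utilization-driven scheme of this kind can be sketched in a few lines of Python. The function name `plan_engines`, its parameters and its thresholds are hypothetical and do not come from CPM. The sketch assumes an engine is added after utilization stays at or above one threshold for several consecutive samples, and removed after it stays at or below a lower threshold for the same number of samples.

```python
def plan_engines(samples, add_at=95.0, remove_at=70.0, hold=3, max_extra=2):
    """Return the number of extra engines online after each busy sample.

    samples   -- CPC busy percentages for one processor type
    add_at    -- add an engine after `hold` samples at or above this value
    remove_at -- remove one after `hold` samples at or below this value
    All defaults are illustrative assumptions.
    """
    extra = 0
    high = low = 0  # lengths of the current high and low streaks
    history = []
    for busy in samples:
        high = high + 1 if busy >= add_at else 0
        low = low + 1 if busy <= remove_at else 0
        if high >= hold and extra < max_extra:
            extra += 1
            high = 0  # require a fresh streak before adding again
        elif low >= hold and extra > 0:
            extra -= 1
            low = 0
        history.append(extra)
    return history


print(plan_engines([96, 97, 98, 99, 60, 60, 60, 80]))
# [0, 0, 1, 1, 1, 1, 0, 0]
```

Nothing in this logic knows which workload produced the busy samples, which is exactly why a burst of low-priority batch could trigger an increase.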
Is CPM a good thing?
CPM is an elegant solution to the problem of managing highly variable workloads. Platforms without this capability must either keep potentially unused excess capacity on hand or plan far enough ahead to have the processing power in place when it is needed.
The flip side of CPM comes down to, as it always seems to do with mainframes, cost. Most software vendors, IBM included, base prices on processing capacity. Therefore, the first step in dynamic mainframe capacity planning is securing software vendor contracts that won't hurt when CPM adds horsepower. The capacity planner must also take into account that, for some software, a five-minute spike in the middle of the night may represent a peak that will set the price paid for the whole month.
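To see why a short spike matters, consider this Python sketch of how a billable peak might be derived from interval data. The function `billable_peak`, the sample values and the window sizes are assumptions for illustration, and real contract terms vary by vendor. With a window of one interval, a single spike sets the peak; averaging over a longer rolling window smooths it out.

```python
def billable_peak(msu_samples, window=1):
    """Return the highest rolling average over `window` consecutive samples.

    window=1 treats every interval on its own, so one spike sets the peak.
    A larger window models contracts that smooth usage before taking the peak.
    """
    if window < 1 or window > len(msu_samples):
        raise ValueError("window must be between 1 and the number of samples")
    running = sum(msu_samples[:window])
    best = running / window
    for i in range(window, len(msu_samples)):
        # Slide the window forward by one sample.
        running += msu_samples[i] - msu_samples[i - window]
        best = max(best, running / window)
    return best


# Four hours of five-minute samples: steady at 100 MSUs, then one spike.
samples = [100] * 47 + [400]
print(billable_peak(samples))             # 400.0
print(billable_peak(samples, window=48))  # 106.25
```

Under the first measure, five minutes at 400 MSUs would be charged as if the machine ran at 400 MSUs all month, so it pays to know which measure each contract uses before letting CPM add capacity automatically.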
It all comes down to planning.