UNIX system performance monitoring involves more than just keeping an eye on the amount of free space or the number of users who are logged in. One can look into various files to find out a great deal about the system.
Some of the resources that need to be monitored very regularly are CPU power, bandwidth, memory and storage. It is important to note here that these resources will have a direct influence on the performance of the system.
The systems you monitor typically fall into one of two categories: the system has performance problems and you want to improve its performance; or the system performs acceptably and you want to keep it at that level.
The process of monitoring usually consists of the following three phases.
- Monitor the system in order to understand the resources that are causing performance problems.
- Analyze the data collected in step 1 thoroughly and come up with a remedy to improve the performance of the system.
- Once again monitor the system to make sure that the problem has been rectified.
In most cases this monitoring process is iterative: the steps listed above are repeated several times until the system under consideration performs as well as it can.
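The monitor, analyze and re-monitor cycle can be sketched as a simple loop. This is only an illustration: `tune_until_acceptable`, `measure`, `apply_remedy` and the toy CPU figures are hypothetical names and values, not part of any HP UNIX tool.

```python
from typing import Callable, List


def tune_until_acceptable(
    measure: Callable[[], float],
    apply_remedy: Callable[[float], None],
    threshold: float,
    max_rounds: int = 5,
) -> List[float]:
    """Repeat monitor -> analyze -> remedy until the metric is acceptable.

    Returns every measurement taken, in order.
    """
    history = []
    for _ in range(max_rounds):
        value = measure()          # monitor the system
        history.append(value)
        if value <= threshold:     # analyze: is performance acceptable?
            break
        apply_remedy(value)        # act on the analysis, then monitor again
    return history


# Toy stand-ins (assumed values): each remedy lowers a pretend
# CPU utilization figure by 15 percentage points.
state = {"cpu_busy": 95.0}
history = tune_until_acceptable(
    measure=lambda: state["cpu_busy"],
    apply_remedy=lambda _v: state.update(cpu_busy=state["cpu_busy"] - 15.0),
    threshold=70.0,
)
print(history)
```

The `max_rounds` cap is a design choice in this sketch so that the loop ends even when no remedy brings the metric under the threshold.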
Performance monitoring in HP UNIX operating system
As mentioned earlier, the main resources in any system are CPU power, bandwidth, memory and storage. Let us look at what needs to be monitored for each of these resources.
If CPU utilization stays well below 100%, additional processing power is available for other tasks. To get the most out of CPU monitoring, it is important to understand the following CPU utilization statistics:
Context switches: A context switch happens when the CPU stops running one process and starts executing another. Because each context switch requires the OS to take control of the CPU, excessive context switching drives up CPU utilization.
Runnable processes: A process usually exists in one of several states. A typical state is one in which the process is waiting for an I/O operation to complete. In such a scenario, the process does not need the CPU. After a certain amount of time, the state of the process changes and it becomes ready to run. When there is more than one runnable process, all but one of them must wait for their turn. Thus the number of runnable processes is an important statistic when determining the CPU usage of the system.
Access to CPU time: The more processes that are active on a system, the less CPU time each individual process gets, so each task takes longer to complete.
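As a rough illustration of the CPU statistics above, the following Python sketch pulls the runnable-process count, the context-switch figure and the CPU busy percentage out of vmstat-style text. The column layout and every number in `SAMPLE` are assumptions made up for this example; real vmstat output differs between platforms and versions, so check the header on your own system.

```python
# Invented sample in an assumed vmstat-like layout: a group line,
# a column-name line, and one line of values.
SAMPLE = """\
         procs           memory                   page                      faults       cpu
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id
    3     0     0    81234   15320    2    0     0    0     0    0     0    412   1890   760  22  8 70
"""


def parse_vmstat(text):
    """Pair each column name with the value from the last data line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[-2].split()
    values = [int(v) for v in lines[-1].split()]
    if len(names) != len(values):
        raise ValueError("column names and values do not line up")
    return list(zip(names, values))


def summarize(pairs):
    """Extract the statistics discussed in the text."""
    first = {}
    for name, value in pairs:
        # Some names repeat (e.g. 'sy'); keep the first occurrence only.
        first.setdefault(name, value)
    return {
        "runnable": first["r"],            # processes ready to run
        "context_switches": first["cs"],   # as reported in the cs column
        "cpu_busy_pct": 100 - first["id"], # everything that is not idle
    }


summary = summarize(parse_vmstat(SAMPLE))
print(summary)
```

A runnable count that stays above the number of CPUs, together with a low idle percentage, suggests that processes are queuing for CPU time.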
Network throughput: This is the amount of data, usually measured in bits per second, that travels to or from the system. To determine whether network throughput is the bottleneck, one needs to gather data to find out the maximum throughput of the system. The "netstat" command is widely used to find out a variety of network information.
A way of monitoring the usage on the network bandwidth is to use a network protocol called SNMP, or Simple Network Management Protocol. SNMP is generally used to manage network devices from a central management station, and while it is usually a service provided on devices such as routers or hubs, it can also be used with PCs that have an SNMP service running. SNMP does not necessarily need to be used just for management and can be used to gather interface statistics for a device that allows it.
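Interface statistics of this kind are usually byte (octet) counters, so throughput has to be derived from two samples taken some time apart. The sketch below assumes a 32-bit counter that wraps around to zero, as the standard SNMP interface counter `ifInOctets` does; the function names, sample values and link speed are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps to 0 after this many octets


def throughput_bps(octets_start, octets_end, interval_seconds,
                   modulus=COUNTER32_MODULUS):
    """Bits per second between two counter samples.

    The modulo handles a single counter wrap between the samples;
    more than one wrap per interval cannot be detected this way.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    delta_octets = (octets_end - octets_start) % modulus
    return delta_octets * 8 / interval_seconds


def utilization_pct(bps, link_speed_bps):
    """Throughput as a percentage of the link's nominal speed."""
    return 100.0 * bps / link_speed_bps


# Assumed samples: 3,750,000 octets received over 30 seconds
# on a 100 Mbit/s link.
bps = throughput_bps(1_000_000, 4_750_000, 30)
print(bps, utilization_pct(bps, 100_000_000))
```

Sampling at short, regular intervals keeps the chance of a double wrap low on fast links.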
Memory usage: The workload determines the amount of real and virtual memory a system requires. To meet the needs of the system workload, the amount of memory may have to be increased. Note that constant use of virtual memory degrades the performance of the system.
Memory is an area where a wealth of performance statistics can be found. There are several memory utilization parameters, and a system administrator needs to concentrate on them while monitoring a system. An important one is "Free, Shared, Buffered and Cached Pages", which shows the overall mix of memory utilization.
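To make the idea of a memory "mix" concrete, here is a minimal Python sketch that expresses each page category as a share of total physical pages. The page counts are invented for illustration, and the categories may overlap on a real system, so the percentages need not add up to 100.

```python
def memory_mix(total_pages, pages):
    """Return each category as a percentage of total physical pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return {
        name: round(100.0 * count / total_pages, 1)
        for name, count in pages.items()
    }


# Assumed figures, in pages, for a system with 200,000 physical pages.
mix = memory_mix(
    200_000,
    {"free": 30_000, "shared": 10_000, "buffered": 20_000, "cached": 60_000},
)
print(mix)
```

A small free share alongside large buffered and cached shares is often harmless, since that memory can usually be reclaimed; a small free share with little cache is more likely to indicate memory pressure.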
Storage monitoring takes place at two different levels: monitoring for sufficient disk space, and monitoring for storage-related performance issues.
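The first level, checking for sufficient disk space, can be sketched with Python's standard `shutil.disk_usage` call. The 90% warning threshold and the helper names are assumptions chosen for this example, and the simple used/total ratio may differ slightly from what tools such as df report.

```python
import shutil


def percent_used(total_bytes, used_bytes):
    """Used space as a percentage of total space."""
    if total_bytes == 0:
        return 0.0
    return 100.0 * used_bytes / total_bytes


def check_disk(path, warn_at_pct=90.0):
    """Return (percent used, True if at or above the warning threshold)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at_pct


pct, low_space = check_disk("/")
print(f"/ is {pct:.1f}% full; warning: {low_space}")
```

Storage-related performance issues, the second level, need I/O statistics over time rather than a single space check.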
The following table describes some of the popular UNIX utilities that can be used for monitoring the performance of applications. These are available on HP UNIX Operating System.
|Utility||A brief description|
|vmstat||This utility displays virtual memory statistics.|
|iostat||The iostat utility displays I/O utilization statistics for all active disks on the system.|
|sar||This very popular UNIX utility produces system activity reports on CPU, memory and disk usage.|
|ps||This displays information about process states.|
|netstat||With this, one can get statistics about the network connections.|
|top||This gives information about the active processes that are consuming the most CPU.|
|getconf||This displays the maximum allowed value of various system configuration parameters. It is very useful if you run into a problem and think that you might have exceeded some predefined limit.|
|quota||This utility is used for examining disk usage. It shows how your disk usage compares with your permitted maximum. Your disk usage may exceed your "quota" for a short "grace period" (typically a week), but it may never exceed your "limit". If the grace period runs out, you will have to delete files before you can create any more. You may have to type "quota -v" to get full information.|
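The quota rule described in the table (a soft "quota", a hard "limit" and a grace period) can be sketched as a small decision function. This models only the rule as stated above; `can_create_files` and its parameters are hypothetical and do not read real quota data.

```python
def can_create_files(usage, quota, limit, grace_days_left):
    """Apply the quota rule: soft quota, hard limit, grace period.

    usage, quota and limit are in the same unit (e.g. disk blocks).
    """
    if usage >= limit:
        return False              # the hard limit may never be exceeded
    if usage <= quota:
        return True               # within the permitted maximum
    return grace_days_left > 0    # over quota: allowed only during grace


# Assumed figures: quota of 100 blocks, limit of 120 blocks.
print(can_create_files(80, 100, 120, 7))    # under quota
print(can_create_files(110, 100, 120, 3))   # over quota, grace remaining
print(can_create_files(110, 100, 120, 0))   # over quota, grace expired
```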
Performance monitoring involves analyzing data to locate the major bottlenecks. Once you identify a bottleneck, the next step is to take corrective measures to improve performance. As mentioned above, system monitoring is an iterative process and needs to be done on a daily basis. In this way a system administrator can make sure that all the systems are operating at optimum performance levels. It is also a good idea to generate periodic reports on the health of the systems.
ABOUT THE AUTHOR: Swayam Prakasha has been working in information technology for several years, concentrating on operating systems, network security and Internet services. Prakasha has authored a number of articles for trade publications, and he presents papers at industry conferences. Currently he works at Unisys Bangalore in the Open Source Group.