Linux debugging with SystemTap dynamic instrumentation
In this tip, learn how SystemTap's highly scriptable dynamic instrumentation gives it an edge over traditional Linux server debugging and performance monitoring tools.
When analyzing Linux server performance, traditional tools work fine if you simply want a quick overview of what's happening, but they don't allow you to delve into what is really going on in your system. SystemTap, however, offers advanced low-level options to get to the core of the problem.
The essence of SystemTap is that it actually puts a tap on your system. Using these taps helps you find out what really happens. To do so, SystemTap uses dynamic instrumentation that works with tracepoints. But what is dynamic instrumentation? To answer that question, you need to know a little about the way programs run in an operating system.
Today's systems are complicated, and there may be many reasons for performance problems. In many cases, simply watching what a program appears to be doing is too superficial to find the cause of a performance problem. For instance, a program might use functions that you aren't aware of. That is where dynamic instrumentation can help you -- by integrating into the Linux kernel and tracing what exactly a program is trying to do. This approach has an additional benefit: It works without slowing down programs or interrupting the availability of machines. SystemTap is available for most recent Linux distributions. The information in this article is based on Fedora 12. If you haven't installed SystemTap yet, you can install it with the following commands:
yum install systemtap kernel-devel yum-utils
debuginfo-install kernel
Differences from traditional debugging methods
To find out exactly what a program is doing, you need a debugger. Traditional debugging techniques have some problems, however. For instance, a commonly used debugger such as gdb interrupts normal operation, and sometimes you even need to recompile or reinstall software. The most notable disadvantage of traditional debuggers is that they only look at one executable or aspect at a time, not at the entire system, and by looking at one aspect only, you might miss important information. These disadvantages don't apply to dynamic instrumentation.
SystemTap isn't just dynamic instrumentation -- it is highly scriptable as well. That allows you to use conditional constructs, associative arrays and much more. Even better, you don't necessarily have to write the scripts yourself; you can use the example scripts that are provided. After installation, you can find some useful scripts in /usr/share/doc/systemtap/examples. Have a look at index.html to get an overview.
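To give an idea of what the scripting language offers, the following short script counts read operations per process in an associative array and prints the five busiest processes after 10 seconds. (This is a minimal sketch written for this article, not one of the bundled example scripts.)

global reads

probe vfs.read {
  # count reads per process name
  reads[execname()]++
}

probe timer.s(10) {
  # "reads-" sorts the array by value in descending order
  foreach (name in reads- limit 5)
    printf("%s: %d reads\n", name, reads[name])
  exit()
}

Sorting an array by value with a single foreach construct like this would take considerably more work with a traditional debugger.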
Every SystemTap script is based on events and probe handlers. The idea is simple to understand: When an event occurs, the handler performs some action. Below you can see an easy-to-understand example of a SystemTap script. First, the probe keyword is used to define the event, which in this case is vfs.read (a read action that occurs on the file system). When this event occurs, printf is used to display a message and the script exits.
$ cat simple.stp
probe vfs.read {
  printf("read performed\n")
  exit()
}
$ stap simple.stp
read performed
$
To do its work, SystemTap uses tracepoints: callbacks that are placed at strategic points in the kernel. In kernel version 2.6.34, there are 282 of these tracepoints. (In 2.6.28, there were just 12 of them!)
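You can address such a tracepoint directly from your own scripts as well. The following sketch (the exact tracepoint name may differ between kernel versions) prints a message every time the scheduler switches tasks; press Ctrl-C to stop it:

probe kernel.trace("sched_switch") {
  # cpu() returns the number of the CPU the switch happens on
  printf("context switch on CPU %d\n", cpu())
}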
Some scripts, such as schedtimes.stp, use these tracepoints. This script monitors the scheduler, the essential kernel component that determines which task runs where and at what moment. The interesting thing about using tracepoints is that SystemTap talks directly to functionality in the kernel -- in this case, the sched_switch tracepoint, which fires when the scheduler switches context (meaning that the scheduler turns its attention to another process that was waiting to be served). Another important scheduler tracepoint is sched_wakeup, which fires when a waiting process is moved into the queue of runnable processes. Together, these allow you to find out how long a process has been sitting in the queue before it could be served, which is important performance data.
Run the script with stap process/schedtimes.stp, which gives a very nice overview of where each process spends its time with regard to the scheduler. Use -c [command] to monitor a specific command: The script starts, the command runs, and when the command stops the script produces its output, which lets you see exactly how the program in question interacts with the scheduler.
SystemTap can provide detailed information on system activity
[root@fedora examples]# stap process/schedtimes.stp
all mode
^C execname: pid run(us) sleep(us) io_wait(us) queued(us) total(us)
events/0: 6 845 5366178 0 2743 5369766
sync_supers: 12 15 3882049 0 20 3882084
bdi-default: 13 20 3331703 0 70 3331793
kblockd/0: 15 12 2881021 0 108 2881141
ata/0: 19 5085 4162266 0 1210 4168561
scsi_eh_1: 37 2783 4165170 0 451 4168404
mpt_poll_0: 247 94 4758588 0 347 4759029
kdmflush: 284 110 2880259 280 213 2880582
jbd2/dm-0-8: 302 755 2880878 0 75 2881708
vmmemctl: 1072 241 4758661 0 133 4759035
vmtoolsd: 1253 4498 5470363 0 1565 5476426
rsyslogd: 1397 1259 5566454 0 792 5568505
hald-addon-inpu: 1557 1034 4107784 0 575 4109393
hald-addon-stor: 1571 2078 4165740 0 932 4168750
hald-addon-stor: 1574 120 4168190 0 449 4168759
Xorg: 1700 84825 3993184 0 31388 4109397
sendmail: 1757 199 4046908 0 9 4047116
rtkit-daemon: 1829 202 5079235 0 98 5079535
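To look at a single workload rather than the whole system, you can combine the script with the -c option described above. Here dd is just an arbitrary example command:

$ stap process/schedtimes.stp -c "dd if=/dev/zero of=/dev/null bs=1M count=500"

When dd finishes, the script stops automatically and prints its scheduler statistics for that run.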
Another example: Big Kernel Lock
Big Kernel Lock was introduced in Linux 2.0, during the era in which multiple processors became common in computer systems. The idea was that on a multiprocessor system, only one processor could execute kernel code at the same time. This caused scaling problems, and therefore the kernel developers have been replacing it with fine-grained locking. For performance-analysis purposes, you might want to know whether certain kernel subsystems use the Big Kernel Lock and, if so, how badly the rest of the system suffers from that.
Currently, some kernel subsystems still use the BKL, such as NFS (fixed in RHEL 6), SMB and TTY. SystemTap provides an example script with the name bkl.stp, which shows the number of threads that wait on the kernel lock. If a given number of waiting threads is exceeded, it prints the holding thread, including the name of the process, the PID and how long the process held the lock. This is useful information for fixing performance problems.
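Assuming bkl.stp takes the threshold number of waiting threads as its argument (check the comments at the top of the script on your system), running it could look like this:

$ stap bkl.stp 5

The script would then report the BKL holder whenever five or more threads are waiting on the lock.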
Conclusion
Linux provides some excellent tools to monitor your system, but most of them are not capable of giving a generic overview of what is happening on the system as a whole. It is particularly difficult to find out how certain programs interact with vital kernel parts, such as the scheduler. To find out what exactly is going on, SystemTap is a very useful tool. Because it talks directly to tracepoints that are set in the kernel, SystemTap gives you useful, up-to-date information that may help you trace why certain programs are causing a performance problem.
ABOUT THE AUTHOR: Sander van Vugt is an author and independent technical trainer, specializing in Linux since 1994. Van Vugt is also a technical consultant for high-availability (HA) clustering and performance optimization, as well as an expert on SLED 10 administration.