Documentation/devtools/maemo5/strace

kaltsi (Talk | contribs)
(New page: == Description == Strace is a handy tool for seeing how an application is interacting with the operating system using the system calls. It displays the used system calls and parameters. I...)
Newer edit →

Revision as of 08:06, 2 October 2009

Contents

Description

Strace is a handy tool for seeing how an application is interacting with the operating system using the system calls. It displays the used system calls and parameters. It also shows the return codes of the system calls. In addition strace can attach to a running process making it a suitable tool for debugging the system calls that an already running process is calling.

Strace is especially helpful for:

  • Figuring out why the program is failing to start or exits early.
  • Finding out what (configuration) files are used by the program.
  • Checking out whether the program is idle or wasting battery when it should not be doing anything.
  • Finding out where a frozen program got stuck.
  • Viewing an application's communications with the network.


Packages

source: strace

binary: strace

Usage Examples

Installing Strace

The strace tool can be installed to the N900. In Scratchbox it is possible to use the Scratchbox-provided strace. Using strace in an X86 target rather than in an ARMEL target is preferable for the reason that in an ARMEL target strace will actually trace the system calls of the QEMU process (which is used to run the arm binary).

Using Strace

These examples can be run both in Scratchbox X86 environment and in the N900 device itself.

Getting a System Call Trace from an Application

To get a simple list of the system calls the application is calling, strace is run with the path to the program as a parameter. Below strace is run with the small gdb_example program that is introduced in the [maemo debugging guide][].

[sbox-FREMANTLE_X86: ~/src/testing/gdb_example] > strace ./gdb_example
execve("./gdb_example", ["./gdb_example"], [/* 50 vars */]) = 0
brk(0)                                  = 0x804a000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001c000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/i686/sse2", 0xbff86948) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/i686", 0xbff86948)     = -1 ENOENT (No such file or directory)
open("/lib/tls/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/sse2", 0xbff86948)     = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY)    = -1 ENOENT (No such file or directory)
stat64("/lib/tls", 0xbff86948)          = -1 ENOENT (No such file or directory)
open("/lib/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/lib/i686/sse2", 0xbff86948)    = -1 ENOENT (No such file or directory)
open("/lib/i686/libc.so.6", O_RDONLY)   = -1 ENOENT (No such file or directory)
stat64("/lib/i686", 0xbff86948)         = -1 ENOENT (No such file or directory)
open("/lib/sse2/libc.so.6", O_RDONLY)   = -1 ENOENT (No such file or directory)
stat64("/lib/sse2", 0xbff86948)         = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 Y\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1213256, ...}) = 0
mmap2(NULL, 1222300, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4001d000
mmap2(0x40142000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x124) = 0x40142000
mmap2(0x40145000, 9884, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40145000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40148000
set_thread_area({entry_number:-1 -> 6, base_addr:0x401486c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x40142000, 4096, PROT_READ)   = 0
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 2), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40149000
write(1, "Now in main().\n", 15Now in main().
)        = 15
write(1, "Now in example_1(1) function.\n", 30Now in example_1(1) function.
) = 30
write(1, "Now in example_2(parameter here)"..., 43Now in example_2(parameter here) function.
) = 43
write(1, "Now in example_3()\n", 19Now in example_3()
)    = 19
exit_group(0)                           = ?
[sbox-FREMANTLE_X86: ~/src/testing/gdb_example] > 

N.B.

If you are using strace in an ARMEL target, the option -f needs to be given, so that the process run under the QEMU is also traced.

Essentially, one line in the output refers to one system call made to the Linux kernel. Every line has the name of the system call, the parameters to the call, and finally the return value of the system call.

For example, the line:

write(1, "Now in main().\n", 15Now in main().
)        = 15

tells that the system call that was called from gdb_example was write() with parameters (1,"Now in main().\n", 15), next the output of the call is shown: "Now in main()." and finally the return code is displayed: =15.

Note: Failed system calls usually (not always) return a return code of -1.

An example of a failed stat64 call would be:

stat64("/lib/sse2", 0xbff86948)         = -1 ENOENT (No such file or directory)

Here the stat64 return value of -1 is displayed together with the symbolic name of ENOENT error code. You can see clearly that the program tried to access /lib/sse2, but failed for the reason that the file does not exist.

Studying the output of strace and comparing it to the source file tells that, for example, the gdb_example program's printf() functions are each calling the write() system call. In addition, it shows all the open() and stat64() and other system calls in the list.

Attaching to a Running Process with Strace

One can attach to a running process with strace. This is helpful, especially when debugging a daemon type of an application, and it is necessary to see what system calls the daemon is calling without restarting the daemon.

In the following example, the top application is started inside Scratchbox and left running in one terminal. Then another terminal will be opened and strace is attached to the top application.

1. Start the top application inside one terminal in Scratchbox X86 target:

[sbox-FREMANTLE_X86: ~] > top

## the screen clears and top starts to loop...

top - 14:49:35 up 17 days, 23:38,  0 users,  load average: 0.07, 0.02, 0.00
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:    775176k total,   754708k used,    20468k free,   126608k buffers
Swap:  1622524k total,    19716k used,  1602808k free,   413752k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12807 user      16   0  2184 1172  888 R 79.0  0.2   0:00.09 top
    1 root      16   0  1632  536  448 S  0.0  0.1   0:01.56 init
    2 root      34  19     0    0    0 R  0.0  0.0   0:00.08 ksoftirqd/0
    3 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0 
    4 root      10  -5     0    0    0 S  0.0  0.0   0:00.18 events/0 
    5 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 khelper
    6 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
    8 root      10  -5     0    0    0 S  0.0  0.0   0:00.11 kblockd/0
    9 root      10  -5     0    0    0 S  0.0  0.0   0:00.05 kacpid
 ... etc ...

1. Start another terminal window, log in on the Scratchbox and run the following commands:

[sbox-FREMANTLE_X86: ~] > pidof top
12807
[sbox-FREMANTLE_X86: ~] > strace -p 12807

With the -p flag strace can be attached to a running process.

The pidof command tells the process id (PID) of a running program. In this example, it is 12807 (in your machine the number will be different). The output of strace would look similar to this:

Process 12807 attached - interrupt to quit
select(1, [0], NULL, NULL, {2, 320000}) = 0 (Timeout)
fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
read(0, 0xbfeb191f, 1)                  = -1 EAGAIN (Resource temporarily unavailable)
ioctl(0, TCFLSH, 0)                     = 0
fcntl64(0, F_SETFL, O_RDWR)             = 0
gettimeofday({1195023907, 63212}, {4294967176, 0}) = 0
open("/proc", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 7
fstat64(7, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
fcntl64(7, F_SETFD, FD_CLOEXEC)         = 0
getdents64(7, /* 35 entries */, 1024)   = 1008
getdents64(7, /* 37 entries */, 1024)   = 1008
stat64("/proc/1", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/1/stat", O_RDONLY)          = 8
read(8, "1 (init) S 0 1 1 0 -1 4194560 43"..., 1023) = 187
close(8)                                = 0
open("/proc/1/statm", O_RDONLY)         = 8
read(8, "738 463 133 21 0 361 0\n", 1023) = 23
close(8)                                = 0
stat64("/proc/2", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/2/stat", O_RDONLY)          = 8
read(8, "2 (kthreadd) S 0 0 0 0 -1 41024 "..., 1023) = 128
close(8)                                = 0
open("/proc/2/statm", O_RDONLY)         = 8
read(8, "0 0 0 0 0 0 0\n", 1023)        = 14
close(8)                                = 0
stat64("/proc/3", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/3/stat", O_RDONLY)          = 8
read(8, "3 (migration/0) S 2 0 0 0 -1 410"..., 1023) = 135
close(8)                                = 0

... long list of output ...

Studying the output of strace, you can see what system calls the top program is calling during its main loop. In this case, top is opening a lot of files under the /proc/ file system to get information about the running processes in the system.

How to Trace All the Running Applications

Occasionally it may be necessary to trace all the running applications in the device. For example, it may be necessary to know which process accesses some specific file, while not keeping it open all the time (if it is kept open all the time, it can be found from /proc/*/fd/ or with lsof).

In the following example from the device, the open() system calls are traced from all running programs started after the DSME process.

/home/user # strace -e trace=open \
      $(cd /proc/; dsme=$(pidof dsme|cut -d' ' -f1); \
      for pid in [0-9]*; do \
         if [ $pid -gt $dsme ] && [ $pid != $$ ]; then \
             echo "-p $pid"; \
         fi; \
      done) 

Note: The above command does not start to trace the DSME. Accidentally suspending DSME would cause the device to reboot because DSME keeps the device's HW watchdog refreshed.

The output of the command will look similar to this:

/home/user # strace -e trace=open \
>       $(cd /proc/; dsme=$(pidof dsme|cut -d' ' -f1); \
>       for pid in [0-9]*; do \
>          if [ $pid -gt $dsme ] && [ $pid != $$ ]; then \
>              echo "-p $pid"; \
>          fi; \
>       done) 
Process 1019 attached - interrupt to quit
Process 1036 attached - interrupt to quit
Process 1039 attached - interrupt to quit
...snip...
Process 13559 attached - interrupt to quit
attach: ptrace(PTRACE_ATTACH, ...): No such process
Process 332 attached - interrupt to quit
Process 3606 attached - interrupt to quit
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Process 434 attached - interrupt to quit
Process 682 attached - interrupt to quit
...snip...
Process 332 detached
[pid   762] open("/dev/dspctl/ctl", O_RDWR) = 6
[pid   762] open("/dev/dspctl/ctl", O_RDWR) = 6
[pid   762] open("/dev/dspctl/ctl", O_RDWR) = 6
[pid   709] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   709] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   762] open("/dev/dspctl/ctl", O_RDWR) = 6
[pid   709] --- SIGALRM (Alarm clock) @ 0 (0) ---
... etc ...
          

The output shows which processes strace attaches to and what are the files those processes are opening.

N.B.

If there are lots of system calls performed by the process(es), strace might not show all of them. So, basically the output in this kind of a situation is only indicative. For more accurate information, a specific single processes can be traced.

Other Strace Options

Here are examples of some of the useful strace options.

  • -f fork
The -f fork option tells strace to follow the execution to the forked process from the calling process. By default, strace does not follow forks.
  • -e expr
The -e option is handy when it is necessary to limit the scope of the output. For example,
[sbox-FREMANTLE_X86: ~/src/testing/gdb_example] > strace -e trace=network ./gdb_example
would only output those calls that are network-related. In this example, there would be no other output than the normal output of the program.
However,
[sbox-FREMANTLE_X86: ~/src/testing/gdb_example] > strace -e trace=file ./gdb_example
would only show which system calls are called related to accessing files.
  • -tt
The -tt option makes strace precede each system call with the time it was called. This could potentially be used for rough performance analysis.
Start a process and trace all IPC and network related system calls it makes and print the time of day with microseconds for each traced system call.
[sbox-FREMANTLE_X86: ~/] > strace -tt -e trace=ipc,network ./some_program -programs -own -args

Links

[strace man page](/development/documentation/man_pages/strace.html)

[maemo debugging guide][]

[maemo debugging guide]: /development/documentation/how-tos/4-x/maemo_debugging_guide.html

See Also

ltrace