Documentation/devtools/maemo5/oprofile
This article documents a developer tool. A list of available devtools, together with installation instructions, is also available.
Description
Oprofile is a low-overhead, system-wide profiler for Linux. It can be used to find CPU usage bottlenecks across the whole system and within individual processes.
Packages
source: oprofile
binary: oprofile
Installing Oprofile
Configuring the device
To run oprofile on your device, an extra kernel module must be installed. The oprofile kernel module is provided by the kernel-modules-debug package, which can be installed from the tools repository.
Installing oprofile to the device
Provided that you have the Fremantle tools repository in your APT sources.list, the easiest way to install oprofile and the required kernel module is using apt.
Nokia-N900-40-12:~# apt-get install oprofile kernel-modules-debug
This will also install binutils.
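Before profiling, it can help to confirm that both pieces are actually in place. The following is a minimal sketch for a POSIX shell, not part of oprofile: the function name check_oprofile_install is hypothetical, and the default module path is simply the one passed to insmod in the Usage section.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with oprofile): report whether the
# oprofile kernel module file and the opcontrol tool can be found.
# $1: module path (default: the path used with insmod in the Usage section)
# $2: tool name to look up on PATH (default: opcontrol)
check_oprofile_install() {
    module=${1:-/lib/modules/current/oprofile.ko}
    tool=${2:-opcontrol}
    status=0
    if [ ! -f "$module" ]; then
        echo "missing kernel module: $module"
        status=1
    fi
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing tool: $tool"
        status=1
    fi
    return $status
}
```

Called without arguments, check_oprofile_install checks the defaults and returns non-zero if either piece is missing.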
Installing debug symbols
To get useful profiling information at the function level, you need to install debugging symbols. Debugging symbols normally come in debugging (-dbg) packages. The easiest way to install all the -dbg packages required for a given binary is the debug-dep-install script, which comes with the maemo-debug-scripts package:
Nokia-N900-40-12:~# apt-get install maemo-debug-scripts
Nokia-N900-40-12:~# debug-dep-install /usr/bin/osso-xterm.launch
Usage
1. On the device, type:
Nokia-N900-40-12:~# insmod /lib/modules/current/oprofile.ko
Nokia-N900-40-12:~# opcontrol --no-vmlinux
Nokia-N900-40-12:~# opcontrol --separate=kernel
Nokia-N900-40-12:~# opcontrol -c 8
Nokia-N900-40-12:~# opcontrol --init
Like the --separate=library option, the --separate=kernel option splits the collected statistics per process and per component. In most use cases, processes (implicitly) ask other processes, such as the X server and hildon-desktop, to do work for them. To optimize CPU usage, you need to see which processes across the whole system use the most CPU, and in which of their components (binary or libraries). The --separate=kernel option additionally attributes CPU usage inside the kernel to the processes that caused it. This part is reported under the vmlinux binary name (shown as no-vmlinux when the --no-vmlinux option is used, as in the reports below). The -c 8 option makes oprofile collect call graph information up to a depth of 8 function calls.
2. Start the use case you are interested in and type:
Nokia-N900-40-12:~# opcontrol --reset
Nokia-N900-40-12:~# opcontrol --start
3. When you've finished, type:
Nokia-N900-40-12:~# opcontrol --stop
Now you've collected the data.
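The three steps above can be wrapped into two small shell functions so that sampling covers only the use case being measured. This is a sketch under stated assumptions, not an oprofile feature: the function names profile_setup and profile_run and the OPCONTROL and INSMOD override variables are hypothetical, and exist only so that the command sequence can be previewed without touching the device.

```shell
#!/bin/sh
# OPCONTROL and INSMOD are hypothetical override variables; setting
# OPCONTROL="echo opcontrol" prints the commands instead of running them.
OPCONTROL=${OPCONTROL:-opcontrol}
INSMOD=${INSMOD:-insmod}

# Step 1: load the module and configure the profiler.
# $1: call graph depth (default 8, as in "opcontrol -c 8")
profile_setup() {
    depth=${1:-8}
    $INSMOD /lib/modules/current/oprofile.ko ||
        echo "insmod failed (is the module already loaded?)" >&2
    $OPCONTROL --no-vmlinux &&
    $OPCONTROL --separate=kernel &&
    $OPCONTROL -c "$depth" &&
    $OPCONTROL --init
}

# Steps 2 and 3: reset old samples, then sample only while the given
# command runs. Returns the exit status of that command.
profile_run() {
    $OPCONTROL --reset &&
    $OPCONTROL --start || return 1
    "$@"
    rc=$?
    $OPCONTROL --stop
    return $rc
}
```

For example, after profile_setup, running profile_run sleep 30 would sample the whole system for 30 seconds while you exercise the use case by hand.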
Viewing profile reports
To see a basic per-process picture, type opreport:
Nokia-N900-40-12:~# opreport
CPU: OMAP GPTIMER, speed 0 MHz (estimated)
Counted GPTIMER_CYCLES events (32KiHz timer clock cycles between interrupts) with a unit mask of 0x00 (No unit mask) count 16
GPTIMER_CYCLES:16|
  samples|      %|
------------------
    43666 88.5972 no-vmlinux
     2636  5.3484 maemo-launcher
        GPTIMER_CYCLES:16|
          samples|      %|
        ------------------
              491 18.6267 no-vmlinux
              450 17.0713 libclutter-eglx-0.8.so.0.800.2
              410 15.5539 libgobject-2.0.so.0.2000.3
              342 12.9742 libGLESv2.so
              275 10.4325 libglib-2.0.so.0.2000.3
              138  5.2352 libpthread-2.5.so
              134  5.0835 libc-2.5.so
               55  2.0865 libdbus-1.so.3.4.0
               50  1.8968 hildon-desktop.launch
               45  1.7071 libpango-1.0.so.0.2400.2
               42  1.5933 libgdk-x11-2.0.so.0.1400.7
               32  1.2140 libX11.so.6.2.0
               32  1.2140 libpulsecommon-0.9.15.so
               28  1.0622 libgtk-x11-2.0.so.0.1400.7
               27  1.0243 libgio-2.0.so.0.2000.3
...
Once you know which processes and components take most of the CPU, you need to find the bottleneck functions in them. For this, you need to install debug symbols for them.
Note: If, with the --separate=kernel option, a lot of the kernel's CPU activity is not assigned to any process, the system/kernel is idle. If your use case is (unexpectedly) slow even though the system idles a lot, the cause is usually locking or some other inter-process interaction issue, which cannot be analyzed by looking at CPU usage.
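On a busy system the per-process summary can be long, so it may be convenient to keep only the processes above a given share of samples. The filter below is a sketch, assuming that opreport right-aligns the sample counts in a fixed-width column and indents the per-component rows nested under a process further than the per-process rows. The function name top_processes is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read an opreport summary on stdin and print the
# per-process rows whose share of samples is at least $1 percent (default 5).
# Assumption: sample counts are right-aligned, and rows nested under a
# process are indented further than the per-process rows.
top_processes() {
    awk -v min="${1:-5}" '
        NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ {
            match($0, /^[ \t]*[0-9]+/)       # RLENGTH: column where the count ends
            if (width == 0) width = RLENGTH  # first data row is a per-process row
            if (RLENGTH == width && $2 + 0 >= min + 0)
                printf "%s %s%%\n", $3, $2
        }'
}
```

A typical invocation would be opreport | top_processes 5. Percentages in the nested rows are relative to their process, which is why they are skipped rather than compared against the same threshold.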
For a more detailed, symbol-level analysis, use opreport -l:
Nokia-N900-40-12:~# opreport -l /usr/bin/Xorg | more
warning: /no-vmlinux could not be found.
CPU: OMAP GPTIMER, speed 0 MHz (estimated)
Counted GPTIMER_CYCLES events (32KiHz timer clock cycles between interrupts) with a unit mask of 0x00 (No unit mask) count 16
samples  %        image name               symbol name
313      51.7355  no-vmlinux               /no-vmlinux
153      25.2893  Xorg                     /usr/bin/Xorg
36        5.9504  libpixman-1.so.0.15.3    /usr/lib/libpixman-1.so.0.15.3
31        5.1240  libc-2.5.so              /lib/libc-2.5.so
11        1.8182  libexa.so                /usr/lib/xorg/modules/libexa.so
10        1.6529  fbdev_drv.so             /usr/lib/xorg/modules/drivers/fbdev_drv.so
10        1.6529  librecord.so             /usr/lib/xorg/modules/extensions/librecord.so
8         1.3223  libsrv_um.so             /usr/lib/libsrv_um.so
7         1.1570  libdbus-1.so.3.4.0       /usr/lib/libdbus-1.so.3.4.0
7         1.1570  libpthread-2.5.so        /lib/libpthread-2.5.so
6         0.9917  libfb.so                 /usr/lib/xorg/modules/libfb.so
5         0.8264  libdri2.so               /usr/lib/xorg/modules/extensions/libdri2.so
5         0.8264  librt-2.5.so             /lib/librt-2.5.so
2         0.3306  libpvr2d.so              /usr/lib/libpvr2d.so
1         0.1653  evdev_drv.so             /usr/lib/xorg/modules/input/evdev_drv.so
Once you know which functionality is a bottleneck, you need to find out whether your process should be (indirectly) causing the use of that functionality in the first place, whether it is using it too much or too often, or whether the bottleneck functionality itself should be optimized. This analysis falls to the developers of the corresponding process, as only they know what their application is trying to achieve, and why and how. Until this kind of analysis has been done, it is too early to file bugs against lower-level components.
Profiling with callgraphs
If you have initialized opcontrol with the -c option as described above, you should now be able to get call graphs for your applications. The textual output of opreport in this mode is somewhat difficult to read, but it can be turned into a graph:
Nokia-N900-40-12:~# opreport -l /usr/bin/Xorg -c > oprofile.log
#...copy the oprofile.log to your PC...
myPC$ cat oprofile.log | python gprof2dot.py -f oprofile | dot -Tpng -o callgraph.png
You need the gprof2dot.py script and the dot tool. dot is part of Graphviz, which Ubuntu ships in its graphviz package.
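Since the pipeline fails in a not very obvious way when one of these tools is missing, a small wrapper on the PC can check for them first. This is a sketch only: the function names missing_tools and render_callgraph are hypothetical, and GPROF2DOT is an assumed variable naming the place where you saved gprof2dot.py (no package sets it).

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Hypothetical wrapper around the pipeline above.
# $1: oprofile.log copied from the device
# $2: output image (default callgraph.png)
# GPROF2DOT: assumed variable pointing at gprof2dot.py (default: current dir)
render_callgraph() {
    log=$1
    png=${2:-callgraph.png}
    script=${GPROF2DOT:-gprof2dot.py}
    missing=$(missing_tools python dot)
    if [ -n "$missing" ]; then
        echo "please install:" $missing >&2
        return 1
    fi
    [ -f "$script" ] || { echo "cannot find $script" >&2; return 1; }
    python "$script" -f oprofile < "$log" | dot -Tpng -o "$png"
}
```

With the script in the current directory, render_callgraph oprofile.log writes callgraph.png, exactly as the manual pipeline does.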
Viewing reports from a PC
opreport -l, and especially opreport -c -l, can take quite a long time when run on the device. Therefore, it often makes sense to run opreport in scratchbox.
- Configure the scratchbox target so that its binaries and libraries match the device's 100%.
- Collect profiling data as usual.
- Copy the contents of /var/lib/oprofile from the device to the corresponding directory in the scratchbox target.
- In scratchbox, apt-get install maemo-debug-scripts (this step must not be skipped).
- Install debug packages, either with debug-dep-install or by hand.
Note: the binaries and libraries in the scratchbox target must match what's in the device, otherwise you will get bogus results.
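One way to check that the files really match is to compare checksums. The helpers below are a sketch, assuming md5sum (with its -c option) is available both on the device and in scratchbox. The function names are hypothetical and nothing here is part of oprofile or scratchbox.

```shell
#!/bin/sh
# On the device: print checksums for the given binaries and libraries.
# Redirect the output to a file and copy that file to the PC.
make_manifest() {
    md5sum "$@"
}

# In scratchbox: verify the manifest against the local files.
# Prints the entries that differ or are missing and returns non-zero
# if there are any.
check_manifest() {
    if md5sum -c "$1" >/dev/null 2>&1; then
        echo "all files match"
    else
        md5sum -c "$1" 2>/dev/null | grep -v ': OK$'
        return 1
    fi
}
```

For example, run make_manifest /usr/bin/Xorg /usr/lib/libpixman-1.so.0.15.3 > manifest on the device, copy manifest into the scratchbox target, and run check_manifest manifest there. The files listed are the ones that appear in your opreport output.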
Oprofile with kcachegrind
kcachegrind is a useful GUI tool for viewing performance data interactively. It is packaged by many modern Linux distributions.
To use it:
- Collect the callgraph oprofile data (see above) and install the same packages in scratchbox as well.
- Copy the profile data to the scratchbox session as described above.
- Install the kcachegrind-converters package on the host (Debian, Ubuntu).
- In scratchbox, run opreport -gdf | op2calltree (you might want to copy the op2calltree script somewhere on the target).
- The resulting files can now be opened with kcachegrind on the host, provided you set it to display all files (the file extensions are wrong).
Links
- [oprofile man page](/development/documentation/man_pages/oprofile.html)
- http://oprofile.sourceforge.net/about/
- http://oprofile.sourceforge.net/doc/controlling.html
- http://kcachegrind.sourceforge.net/cgi-bin/show.cgi
This page was last modified on 16 June 2010, at 10:47.