This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

top-callgraph userspace profile script


Hi -

ajax noted a need to produce a callgraph-laden profile of some random
userspace program.  A bit of work resulted in the prototype script at
the bottom.


This script uses the profile timer interrupt in order to sample what's
going on at all the CPUs at the HZ rate.  (It can be modified to use a
hardware perf* type event source if your hw/kernel supports this.)

At each sample, it sees if what was interrupted was the userspace
process designated as the target of the systemtap session (as
specified by the stap -c or -x arguments).  If so, it grabs the name
of the currently executing function.

It next checks how frequently this particular function has shown up
lately amongst other samples.  If it's beyond a certain threshold (10%
by default, override with optional stap argument $1), then a certain
fraction of time (15% by default, override with $2), a stack
backtrace is saved.  (This latter number is less than 100% by default
since our current backtrace logic is too slow - see PR10665 - and
could slow down the target process too much.)

Periodically (every 5 seconds), all the saved backtraces are printed,
sorted by the number of occurrences.  Backtraces that reoccur
frequently are printed at the top.  Then the cycle begins anew with
counters etc. reset.

The result is a snapshot of the most frequently seen dynamic call
graphs.  Here is some sample output for a Fedora-13 Xvfb binary,
running glxgears:

[...]
Wed May  5 23:26:36 2010 ticks: 18042, process 8974 hits: 251, missed backtraces: 0
function memcpy, called 161 times, 12 occurrences of this backtrace:
 0x0000003e79886c68 : memcpy+0xe8/0x470 [/lib64/libc.so.6]
 0x000000000041f48a : fbBlt+0xfa/0xd80 [/usr/bin/Xvfb]
 0x0000000000420150 : fbBltStip+0x40/0x50 [/usr/bin/Xvfb]
 0x00000000004213ef : fbPutZImage+0x17f/0x1d0 [/usr/bin/Xvfb]
 0x00000000004ac92c : DamageRegionProcessPending+0x296c/0x2ea0 [/usr/bin/Xvfb]
 0x00000000005015b9 : CloseDownClient+0x25b9/0x4470 [/usr/bin/Xvfb]
 0x000000000050392c : SendErrorToClient+0x3ec/0x1d40 [/usr/bin/Xvfb]
 0x00000000004f424a : fbListInstalledColormaps+0x39a/0x640 [/usr/bin/Xvfb]
 0x0000003e7981ed2d : __libc_start_main+0xfd/0x1d0 [/lib64/libc.so.6]
 0x000000000041a1d9 : _start+0x29/0x100 [/usr/bin/Xvfb]
function memcpy, called 161 times, 9 occurrences of this backtrace:
 0x0000003e79886c64 : memcpy+0xe4/0x470 [/lib64/libc.so.6]
 0x000000000041f48a : fbBlt+0xfa/0xd80 [/usr/bin/Xvfb]
 0x0000000000420150 : fbBltStip+0x40/0x50 [/usr/bin/Xvfb]
 0x00000000004213ef : fbPutZImage+0x17f/0x1d0 [/usr/bin/Xvfb]
 0x00000000004ac92c : DamageRegionProcessPending+0x296c/0x2ea0 [/usr/bin/Xvfb]
 0x00000000005015b9 : CloseDownClient+0x25b9/0x4470 [/usr/bin/Xvfb]
 0x000000000050392c : SendErrorToClient+0x3ec/0x1d40 [/usr/bin/Xvfb]
 0x00000000004f424a : fbListInstalledColormaps+0x39a/0x640 [/usr/bin/Xvfb]
 0x0000003e7981ed2d : __libc_start_main+0xfd/0x1d0 [/lib64/libc.so.6]
 0x000000000041a1d9 : _start+0x29/0x100 [/usr/bin/Xvfb]
function memcpy, called 161 times, 3 occurrences of this backtrace:
 0x0000003e79886c59 : memcpy+0xd9/0x470 [/lib64/libc.so.6]
 0x000000000041f48a : fbBlt+0xfa/0xd80 [/usr/bin/Xvfb]
 0x0000000000420150 : fbBltStip+0x40/0x50 [/usr/bin/Xvfb]
 0x00000000004213ef : fbPutZImage+0x17f/0x1d0 [/usr/bin/Xvfb]
 0x00000000004ac92c : DamageRegionProcessPending+0x296c/0x2ea0 [/usr/bin/Xvfb]
 0x00000000005015b9 : CloseDownClient+0x25b9/0x4470 [/usr/bin/Xvfb]
 0x000000000050392c : SendErrorToClient+0x3ec/0x1d40 [/usr/bin/Xvfb]
 0x00000000004f424a : fbListInstalledColormaps+0x39a/0x640 [/usr/bin/Xvfb]
 0x0000003e7981ed2d : __libc_start_main+0xfd/0x1d0 [/lib64/libc.so.6]
 0x000000000041a1d9 : _start+0x29/0x100 [/usr/bin/Xvfb]
[...]


To run it with git systemtap, install debug data for any shared
libraries or executables you wish to consider, then run:
# stap -g tops.stp -c 'CMD ...' -d /path/to/cmd --ldd
or
# stap -g tops.stp -x PID -d /proc/PID/exe --ldd

If you're running released systemtap 1.2, then instead of --ldd,
you may need to enumerate each shared library possibly loaded by
cmd with something like:
# stap -d /lib/libfoo.so -d /lib/libbar.so [...]

If overload or skipped errors occur, rerun with
# stap -DSTP_NO_OVERLOAD -DTRYLOCKDELAY=20000 [...]

If array overflor errors occur, edit script with larger tops[] /
backtraces[] sizes.

If you'd like to override those percentile parameters, add the numbers
to the end of the command line.
# stap [...] 25 100 # collect 100% of the backtraces of
                    # any function running more than 25% of time

If you get a pass-5 error related to 'Unknown symbol in module'
congratulations, you hit PR11573.  Work around this by running
just once:
# stap -e 'probe process("/proc/PID/exe").function("main") {} probe begin {exit()}'
just to build / load the uprobes.ko module.

If you get backtraces with just addresses and no symbol names,
congratulations, you hit a systemtap runtime bug, PR as yet to be
submitted.  Sorry.  You can likely still use nm by hand to decode the
call graphs.

If you're just the curious type, add throughout
# stap -t -v [...]


- FChE


------------------------------------------------------------------------
// $1: Minimum percentage of profile samples hitting any given function
// before considering backtracing accumulation for it.
global min_function_percent = %( $# >= 1 %? $1 %: 10 %)

// $2: Percentage of hits of that function for which we wish to collect
// backtraces.  
global backtrace_percent = %( $# >= 2 %? $2 %: 15 %)

global total, grandtotal
global tops[10000]
global backtraces[10000]
global missed
global errors


// probefunc() unfortunately ignores user-space IPs, PR11574
function uprobefunc() %{
  _stp_symbol_snprint(THIS->__retvalue, MAXSTRINGLEN, REG_IP(CONTEXT->regs), current, 0);
%}


// data collection
probe perf.sw.cpu_clock!, // or other perf.* event
      timer.profile       // fallback
{ 
  grandtotal ++;
  if (user_mode() && pid()==target()) { // focus on target process only
     total ++
     try { // tolerate array overflows
       pf = uprobefunc()
       x = ++ tops[pf];
       if (x > (total*min_function_percent)/100) {
          // but only some of the time, since backtraces are costly
          if (randint(100) < backtrace_percent) {
             backtraces[pf, ubacktrace()] ++;
          }
       }
     } catch(er) {
       errors[er] ++ // noted errors entries will be reported at shutdown
       missed ++
     }
  }
}


// reporting
probe end,error,timer.s(5) 
{
  printf("\n%s ticks: %d, process %d hits: %d, missed backtraces: %d\n", 
         ctime(gettimeofday_s()), grandtotal, target(), total, missed);
  foreach ([pf,bt] in backtraces-) { 
    printf("function %s, called %d times, %d occurrences of this backtrace:\n", 
           pf, tops[pf], backtraces[pf,bt])
    print_ustack (bt)
  }

  grandtotal=0; total=0; missed=0
  delete tops
  delete backtraces
}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]