This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Userspace probing


On 05/09/2011 09:19 AM, Lukas Berk wrote:
> Hey Mandar,
> 
> As you already noted there is $syscall variable with a
> 'process("path").syscall' style probe. To check whether it is an
> 'open()' system call, we'd have to compare $syscall to the corresponding
> syscall number (this varies slightly by architecture).  On my system,
> running 'grep __NR_open /usr/include/*/* ' shows 2 and 5 relating to
> SYS_open (which is what we want here). From there we'd just want to
> create conditionals where the $syscall matches.
> 
> Drawing from that, running a script such as:
> $stap -e 'probe process("ping").syscall {
> if($syscall == 2)
> printf("open 2: %s (%d)\n", execname(), pid())
> if($syscall == 5)
> printf("open 5: %s (%d)\n", execname(), pid())
> }' -c 'ping -c 3 google.com'

Hmm, that isn't going to work correctly.  The value of __NR_open varies
between architectures.

# fgrep -w __NR_open linux-2.6-linus/arch/*/include/asm*/*.h
arch/alpha/include/asm/unistd.h:#define __NR_open		 45
arch/arm/include/asm/unistd.h:#define __NR_open			(__NR_SYSCALL_BASE+  5)
arch/avr32/include/asm/unistd.h:#define __NR_open		  5
arch/blackfin/include/asm/unistd.h:#define __NR_open		  5
arch/cris/include/asm/unistd.h:#define __NR_open		  5
arch/frv/include/asm/unistd.h:#define __NR_open		  5
arch/h8300/include/asm/unistd.h:#define __NR_open		  5
arch/ia64/include/asm/unistd.h:#define __NR_open			1028
arch/m32r/include/asm/unistd.h:#define __NR_open		  5
arch/m68k/include/asm/unistd.h:#define __NR_open		  5
arch/microblaze/include/asm/unistd.h:#define __NR_open		5 /* openat */
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   5)
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   2)
arch/mips/include/asm/unistd.h:#define __NR_open			(__NR_Linux +   2)
arch/mn10300/include/asm/unistd.h:#define __NR_open		  5
arch/parisc/include/asm/unistd.h:#define __NR_open		(__NR_Linux + 5)
arch/powerpc/include/asm/unistd.h:#define __NR_open		  5
arch/s390/include/asm/unistd.h:#define __NR_open                 5
arch/sh/include/asm/unistd_32.h:#define __NR_open		  5
arch/sh/include/asm/unistd_64.h:#define __NR_open		  5
arch/sparc/include/asm/unistd.h:#define __NR_open                 5 /* Common */
arch/x86/include/asm/unistd_32.h:#define __NR_open		  5
arch/x86/include/asm/unistd_64.h:#define __NR_open				2
arch/x86/include/asm/unistd_64.h:__SYSCALL(__NR_open, sys_open)
arch/xtensa/include/asm/unistd.h:#define __NR_open 				  8

So, on most platforms, but not all, 5 is __NR_open.  (For instance, on
ia64, __NR_open is 1028.)  However, 2 is __NR_fork on most platforms.
So you are going to get lots of false positives with the above code.
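
To see the per-architecture difference without grepping kernel sources,
a small C helper can report the number the compiler's own headers use.
This is only a sketch: the function name native_open_nr() is made up for
illustration, and it assumes a Linux system with the glibc
<sys/syscall.h> header installed.

```c
#include <sys/syscall.h>   /* pulls in the __NR_* numbers for the compile target */

/*
 * Hypothetical helper (not part of SystemTap or the kernel): returns the
 * open() syscall number for the ABI this file is compiled for, or -1 when
 * that ABI does not define __NR_open at all.
 */
long native_open_nr(void)
{
#ifdef __NR_open
    return __NR_open;
#else
    return -1;
#endif
}
```

On x86, building this with 'gcc -m64' should give 2 and building it with
'gcc -m32' should give 5, matching the two x86 lines in the grep output
above.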

Here's how to fix this.  To catch the normal case, you can do this:

====
%{
#include <linux/unistd.h>
%}

probe process("ping").syscall {
  if ($syscall == %{ __NR_open %}) {
    printf("open: %s (%d)\n", execname(), pid())
  }
}
====

The above code takes __NR_open from the kernel headers for the platform
the script is compiled on, so it always compares against the right
number.  (It uses embedded C, so run it with 'stap -g'.)  Problem
solved!  Except...

arch/x86/include/asm/unistd_32.h:#define __NR_open  5
arch/x86/include/asm/unistd_64.h:#define __NR_open  2

On 64-bit x86, __NR_open is 2.  But __NR_open is 5 for a 32-bit
executable running on that same 64-bit kernel.

To solve this problem, we've got to know if we're running a 32-bit exe
on the 64-bit kernel.  Here's the code I've used in the past for this,
which adds a function called 'ia32' that lets us know if we're running
an x86 32-bit exe on 64-bit kernel.

====
%{
#include <linux/unistd.h>
%}

%(arch == "x86_64" %?
function ia32:long()
%{ /* pure */
	if (test_tsk_thread_flag(current, TIF_IA32))
		THIS->__retvalue = 1;
	else
		THIS->__retvalue = 0;
%}
%)

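probe process("ping").syscall {
  # Pick the number to compare against by ABI: a 32-bit task's
  # syscall 2 is fork, not open, so don't test __NR_open for it.
%(arch == "x86_64" %?
  is_open = ia32() ? ($syscall == 5) : ($syscall == %{ __NR_open %})
%:
  is_open = ($syscall == %{ __NR_open %})
%)
  if (is_open) {
    printf("open: %s (%d)\n", execname(), pid())
  }
}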
====

Unfortunately we've got to hardcode the 5 here, since __NR_open will be
2 on the 64-bit x86_64 kernel.
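
One subtlety worth spelling out: a 32-bit task that issues syscall 2 is
calling fork, not open, so the number to compare against is best chosen
per task ABI instead of accepting either value.  The decision can be
sketched in plain C as below.  The names is_open_syscall() and
IA32_OPEN_NR are hypothetical, introduced only for illustration, and the
caller is assumed to supply the native __NR_open and the 32-bit flag.

```c
/* __NR_open for 32-bit x86, as in arch/x86/include/asm/unistd_32.h */
#define IA32_OPEN_NR 5L

/*
 * Hypothetical helper, not a SystemTap or kernel function.
 *   nr             - syscall number the task issued
 *   native_open_nr - __NR_open for the running kernel (2 on x86_64)
 *   task_is_ia32   - nonzero if the task is a 32-bit x86 executable
 * Returns 1 if nr is open() for that task's ABI, 0 otherwise.
 */
int is_open_syscall(long nr, long native_open_nr, int task_is_ia32)
{
    if (task_is_ia32)
        return nr == IA32_OPEN_NR;   /* 32-bit ABI: only 5 means open */
    return nr == native_open_nr;     /* native ABI: use the kernel's number */
}
```

With native_open_nr set to 2, this accepts (2, 64-bit) and (5, 32-bit)
and rejects (5, 64-bit) and (2, 32-bit).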

> would return only the open() syscalls, feel free to change the segments
> following the if()'s however you want.
> 
> Another method would be to probe via syscall.open and filter by
> execname() or target().
> 
> Using a similar example to above you could write a script such as:
> 
> stap -e 'probe syscall.open {
> if(execname() == "ping")
> printf("pid: %d\n", pid())
> }' -c 'ping -c 3 google.com'

That code looks fine.

So, which code to pick? It depends on what your application is and what
else is running on your system.  The 'process.syscall' probe is going to
hit for every syscall in your application, but won't slow down any other
process in the system.  The 'syscall.open' probe will only hit for open
syscalls, but will hit on every open syscall on every running process.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

