This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [Ltt-dev] patches to actually use markers?


Mathieu Desnoyers wrote:
> * David Smith (dsmith@redhat.com) wrote:
>> Mathieu Desnoyers wrote:
>>> * David Smith (dsmith@redhat.com) wrote:
>>>>>> Mathieu
>>>> I've been looking at your system call tracing patches.  (I've tried
>>>> running lttv itself without much luck, but it doesn't really matter for
>>>> the sake of this discussion.)
>>>>
>>>> I like the way you use the existing system call tracing points.  So
>>>> we're on the same page, here are the markers I'm seeing in
>>>> arch/x86/kernel/ptrace32.c after applying
>>>> patch-2.6.24-rc2-lttng-0.10-pre23.tar.bz2:
>>>>
>>>>   trace_mark(kernel_arch_syscall_entry, "syscall_id %d ip #p%ld",
>>>> 			(int)regs->orig_eax, instruction_pointer(regs));
>>>>
>>>>   trace_mark(kernel_arch_syscall_exit, MARK_NOARGS);
>>>>
>>>> For systemtap use, we'd like to have more information than that.  On
>>>> syscall entry, we'd like to be able to get the arguments.  On syscall
>>>> exit, we'd like to be able to get the return value.  In fact, the easiest
>>>> thing would be to supply the same information that audit_syscall_entry()
>>>> and audit_syscall_exit() need.
>>>>
>>>> Since I'll bet you've already considered this, I'd like to know why you
>>>> decided to go a different way.
>>>>
>>> Well, the approach taken was to instrument each important system call in
>>> the syscall specific function to be able to actually know what type of
>>> information to record. For instance, if ebx points to a string, the
>>> pointer is not very useful, but the string is.
>> That is (somewhat) true in the case of strings.
>>
>> But, similar problems exist with syscalls that take structure pointers:
>> sys_[gs]ettimeofday, sys_adjtimex, sys_times, sys_nanosleep,
>> sys_[gs]etitimer, sys_timer_create, sys_timer_[gs]ettime,
>> sys_clock_gettime, sys_clock_getres, sys_clock_nanosleep,
>> sys_sched_setscheduler, sys_sched_[gs]etparam, sys_wait4, sys_waitid,
>> sys_rt_sigtimedwait, sys_stat, sys_statfs[64], sys_fstatfs[64],
>> sys_lstat, sys_fstat, and so on (I got tired of looking through syscalls.h).
>>
>> For those syscalls only a pointer can be passed so the marker handler
>> will have to know how to handle that pointer.  That marker handler will
>> need to know that that value is a pointer to a particular structure type
>> and then know how to access it accordingly.
>>
>> The same could be done for strings.  Is it a little more work?  Yes.  Is
>> it fairly easy?  Yes.
>>
>> Let me ask the question another way.  Is there a (measurable)
>> performance hit if the extra arguments to the syscall entry marker are
>> added?  If not, even if lttng doesn't plan to use them, why not add
>> them?  Certainly systemtap (and perhaps other users) could use them.
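
The point above about a handler needing to know which structure type sits behind a pointer argument can be sketched in user space. This is only an illustration under stated assumptions: the syscall ids, `struct mock_timespec`, and `describe_first_arg()` are hypothetical names made up for the sketch, and none of them come from the lttng patches or from systemtap. A real in-kernel handler would also have to copy the data in from user space (for example with copy_from_user() or strncpy_from_user()) rather than dereference the pointer directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical syscall ids for this sketch only; real numbers are
 * architecture-specific. */
enum mock_syscall_id { MOCK_SYS_NANOSLEEP = 1, MOCK_SYS_OPEN = 2 };

/* User-space stand-in for the structure a nanosleep-style call points to. */
struct mock_timespec {
	long tv_sec;
	long tv_nsec;
};

/*
 * Handler side: the marker only delivered a raw argument value.  The
 * handler picks the interpretation (structure, string, plain number)
 * from the syscall id.  Pointers are dereferenced directly here only
 * because this sketch runs entirely in user space.
 */
static int describe_first_arg(int id, uintptr_t arg0, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	switch (id) {
	case MOCK_SYS_NANOSLEEP: {
		const struct mock_timespec *ts =
			(const struct mock_timespec *)arg0;
		return snprintf(buf, len, "rqtp={%ld, %ld}",
				ts->tv_sec, ts->tv_nsec);
	}
	case MOCK_SYS_OPEN:
		/* Same idea for strings: follow the pointer, print the text. */
		return snprintf(buf, len, "filename=\"%s\"",
				(const char *)arg0);
	default:
		/* Unknown syscall: all we can do is show the raw value. */
		return snprintf(buf, len, "arg0=0x%lx", (unsigned long)arg0);
	}
}
```

The structure case and the string case end up in the same switch, which is the sense in which handling strings this way is a little more work but not fundamentally different.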
> 
> Yup, I'd be all in for flexibility, and the performance impact should be
> small. I just wonder if the best approach is to pass the pt_regs pointer
> as a marker argument or to pass the individual registers.

Systemtap would rather have the individual registers than the pt_regs
pointer, since then we don't have to worry about the architecture
details of which registers should contain the args.  Since the
syscall_entry markers are in architecture-specific code, let that code
worry about architecture-dependent details.
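
To make that split concrete, here is a minimal user-space sketch of what "let the architecture code worry about it" could look like.  Everything named here is an assumption for illustration: `struct mock_pt_regs` is a cut-down stand-in for the i386 pt_regs, and `struct syscall_entry_event`, `i386_fill_syscall_entry()` and `format_syscall_entry()` are hypothetical, not part of any posted patch.  The register order used (ebx, ecx, edx, esi, edi, ebp) is the usual i386 int 0x80 argument convention.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Cut-down, hypothetical stand-in for the i386 struct pt_regs. */
struct mock_pt_regs {
	long ebx, ecx, edx, esi, edi, ebp;
	long orig_eax;		/* syscall number on entry */
	unsigned long eip;	/* instruction pointer */
};

/* Architecture-neutral view that a probe handler would receive. */
struct syscall_entry_event {
	long id;
	unsigned long ip;
	long args[6];
};

/* Architecture-specific side: only this function knows which
 * registers carry the syscall arguments. */
static void i386_fill_syscall_entry(const struct mock_pt_regs *regs,
				    struct syscall_entry_event *ev)
{
	assert(regs != NULL && ev != NULL);
	ev->id = regs->orig_eax;
	ev->ip = regs->eip;
	ev->args[0] = regs->ebx;
	ev->args[1] = regs->ecx;
	ev->args[2] = regs->edx;
	ev->args[3] = regs->esi;
	ev->args[4] = regs->edi;
	ev->args[5] = regs->ebp;
}

/* Architecture-neutral side: formats the event without ever
 * touching a register name. */
static int format_syscall_entry(const struct syscall_entry_event *ev,
				char *buf, size_t len)
{
	assert(ev != NULL && buf != NULL);
	return snprintf(buf, len,
			"syscall_id %ld ip 0x%lx args %ld %ld %ld %ld %ld %ld",
			ev->id, ev->ip, ev->args[0], ev->args[1], ev->args[2],
			ev->args[3], ev->args[4], ev->args[5]);
}
```

Another architecture would supply its own fill function and the formatting side would stay unchanged, which is the property we're after.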

> Since the LTTng serializer uses the format string to generically take
> the arguments and write them in a trace, I doubt that writing a pt_regs
> pointer is really useful. On the other hand, passing all the individual
> registers would imply a stack setup cost at runtime (small cost though),
> but would provide somewhat meaningful information in the traces (but
> redundant if we instrument the in-kernel functions).
> 
> Both approaches would let specific probes deal with the syscall
> arguments as they like.
> 
> If we choose to go for the pt_regs pointer passing solution, we could
> add a format string extension to specify that a given argument should
> not be written in the trace. If we pass the pt_regs like this:
> 
>   trace_mark(syscall_entry, "syscall_id %lu ip %p pt_regs #0%p",
>     regs->eax, instruction_pointer(regs), regs);
> 
> An LTTng probe would know that the #0 (# is a prefix to the format
> string element that tells LTTng what type size and format to use in the
> trace, independent of the size used on the gcc side) means that the data
> should be discarded from the trace.
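
For readers following along, the proposed '#0' convention can be sketched as a small format-string scan.  This is a guess at the idea only, not LTTng's actual serializer: `struct fmt_summary` and `scan_marker_format()` are hypothetical names, and the rule "a conversion immediately preceded by #0 is discarded" is an assumption based on the description quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result of scanning one marker format string. */
struct fmt_summary {
	int total;	/* conversions found in the format string */
	int discarded;	/* conversions prefixed with "#0" */
};

/*
 * Walk the format string, count every '%' conversion, and note the
 * ones immediately preceded by "#0".  A literal "%%" is skipped.
 * Other '#' prefixes (such as the "#p" in "ip #p%ld") are counted as
 * ordinary recorded arguments.
 */
static struct fmt_summary scan_marker_format(const char *fmt)
{
	struct fmt_summary s = { 0, 0 };
	size_t i, n;

	assert(fmt != NULL);
	n = strlen(fmt);
	for (i = 0; i < n; i++) {
		if (fmt[i] != '%')
			continue;
		if (fmt[i + 1] == '%') {	/* literal percent sign */
			i++;
			continue;
		}
		s.total++;
		if (i >= 2 && fmt[i - 2] == '#' && fmt[i - 1] == '0')
			s.discarded++;
	}
	return s;
}
```

Run against the format string quoted above, the scan finds three conversions and marks only the pt_regs one as discarded, so a tracer would write two values while a probe could still receive all three.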

As far as systemtap is concerned, I don't have much of an opinion on the
'#0' format specifier: systemtap will never use it (systemtap users never
see the format string), and I believe we'd rather have the individual
registers anyway.

I'd suggest holding off on the '#0' until it is really needed.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

