This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: new static user probe types


On Wed, 2009-07-22 at 20:06 -0700, Roland McGrath wrote:
> > You would make sure that there are enough nops in the place of the probe
> > point for the instruction sequence you want to replace and then the
> > uprobes insert instruction mechanism would (after checking it had enough
> > nop space) insert the instruction sequence (preferable the one used by
> > the utrace mechanism).
> 
> It can be more precisely-tailored than that, you don't need to think of it
> as being a "uprobes method" at all.  It's very simple hard-wired code patching.
> i.e., the macro produces one long nop and you patch that to a relative call.
> You can make it a call to a stock function we provide in some .a you link
> with, or to a stub generated directly in an alternate section by the macros.
> (If you don't need different stubs, it could be in a linkonce section.)
>
> > It would also help with implementing the idea for the ENABLED mechanism
> 
> That's just another variant of code-patching for the same purpose.
> 
> > So, it might be a bit like what Srikar posted to utrace-devel: [...]
> 
> By which you just mean it's another kind of code-patching.

Yes. I am thinking of it as a "uprobes method" since that already
contains the UBP mechanism, interfaces and data structures we would need
to make this more general.

> > Or how about this.  We could expand STAP_PROBE(...) to
> > 
> >    { extern char stap_probe_NNNN_enabled_p;
> >      if (unlikely(stap_probe_NNN_enabled_p)) {
> >         /* current inline-asm stuff, but adding
> >            &enabled_p to the descriptor struct. */
> >      }
> >    }
> 
> The point of this is to skip any argument-packing work generated by the
> compiler, which would be inside the "if unlikely" block, right?

Partly, but it is sadly not guaranteed by GCC.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40207
The main speed win would be not tripping over the "probe trigger".
And the advantage to have a generalized approach to being able to check
whether a probe is enabled or not.

> > Certainly warrants a try and benchmark.
> 
> I think this is a lot like some things Mathieu already experimented with
> and measured in the kernel context.  I think he pursued a code-patching
> flavor that patched an immediate operand because that was measured as
> faster than having the actual extra load of a simple enabled_p variable.

That sounds like what I proposed in
http://sourceware.org/bugzilla/show_bug.cgi?id=10013#c2
the disadvantage is that it needs some code-patching magic. The
advantage of the above approach is that it wouldn't need anything not
already in the kernel mainline, just tricking gcc and the preprocessor
to setup things like we would want.

> > BTW. For storing changeable variables the .probes section should become
> > alloc, rw now always (it currently is only for relocatable objects). 
> 
> It doesn't make sense that it should differ in relocatable objects.
> I don't understand that.

The .probes section stores the addresses of the generated labels that
are used to find the probe addresses. This isn't a problem for an
executable that isn't relocatable. But it is for shared libraries which
has relocatable addresses (if you have selinux memory protection turned
on). In that case the section has to be writable for the linker. See
http://sourceware.org/bugzilla/show_bug.cgi?id=10381

Cheers,

Mark


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]