This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [PATCH] Linux Kernel Markers


On Tue, Sep 19, 2006 at 12:26:32PM -0700, Martin Bligh wrote:
> Vara Prasad wrote:
> >Martin Bligh wrote:
> >
> >>[...]
> >>Depends what we're trying to fix. I was trying to fix two things:
> >>
> >>1. Flexibility - kprobes seem unable to access all local variables etc
> >>easily, and go anywhere inside the function. Plus keeping low overhead
> >>for doing things like keeping counters in a function (see previous
> >>example I mentioned for counting pages in shrink_list).
> >>
> >Using tools like systemtap, one can consult DWARF information, put
> >probes in the middle of a function and access local variables as well,
> >so that is not the real problem. The issue here is that the compiler
> >doesn't seem to generate the required DWARF information in some cases
> >due to optimizations.
> 
> It seems difficult to separate those two from each other. If the
> subsystem you're relying on doesn't work, then ....
> 
> >The other related problem is that even when debug information exists,
> >the breakpoint location has to be specified by line number, which is not
> >maintainable. Having a marker solves this problem as well. Your proposal
> >still doesn't remove the need for markers, if I understood correctly.
> 
> It could, but I think we're better off with the markers, yes.
> 
> >>2. Overhead of the int3, which was allegedly 1000 cycles or so, though
> >>faster after Ingo had played with it, it's still significant.
> >
> >The reason Kprobes use a breakpoint instruction, as pointed out by
> >Prasanna, is that it is atomic on most platforms. We are already working
> >on an improved idea using a jump instruction, with which the overhead is
> >less than 100 cycles on modern CPUs, but it has some limitations and
> >issues related to preemption and SMP.
> >
> >You can get a glimpse of some of the issues here
> >http://sourceware.org/ml/systemtap/2006-q3/msg00507.html
> >http://sourceware.org/ml/systemtap/2005-q4/msg00117.html
> >For more details, search for djprobe in the systemtap mailing list
> >(sorry, I am not able to find a few threads that summarize all the issues).
> 
> "This djprobe is NOT a replacement of kprobes. Djprobe and kprobes
> have complementary qualities. (ex: djprobe's overhead is low, and
> kprobes can be inserted in anywhere.)". Hmm. That seems problematic.
> 
> From what I was describing for function replacement, we could do an NMI
> IPI to everyone, and lock them in there whilst we insert the probe, but
> it's a bit sucky.

We can do batch processing here: send one IPI to everyone
and then insert a bunch of jump instructions. This reduces the
number of IPIs required.
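
A rough user-space sketch of the batching idea, only to show the IPI
accounting. Nothing here is kernel code: stop_other_cpus(),
release_other_cpus() and struct pending_jump are hypothetical names made
up for illustration, and the "IPI" is just a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_OPCODE 0xE9u /* x86 "jmp rel32" opcode */
#define JMP_SIZE   5     /* 1 opcode byte + 4 displacement bytes */

/* One queued patch: where the jump goes and its displacement. */
struct pending_jump {
    uint8_t *addr;
    int32_t  rel32;
};

/* Counts simulated IPI rounds so the two strategies can be compared. */
static unsigned long ipi_rounds;

/* Hypothetical stand-in for "IPI every CPU and hold it there". */
static void stop_other_cpus(void) { ipi_rounds++; }

/* Hypothetical stand-in for letting the held CPUs continue. */
static void release_other_cpus(void) { }

/* Write a 5-byte jmp rel32 (displacement in host byte order). */
static void write_jump(uint8_t *addr, int32_t rel32)
{
    addr[0] = JMP_OPCODE;
    memcpy(addr + 1, &rel32, sizeof rel32);
}

/* Unbatched: one IPI round per probe. */
static void insert_each(const struct pending_jump *q, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        stop_other_cpus();
        write_jump(q[i].addr, q[i].rel32);
        release_other_cpus();
    }
}

/* Batched: one IPI round covers the whole queue. */
static void insert_batched(const struct pending_jump *q, size_t n)
{
    assert(q != NULL || n == 0);
    if (n == 0)
        return;
    stop_other_cpus();
    for (size_t i = 0; i < n; i++)
        write_jump(q[i].addr, q[i].rel32);
    release_other_cpus();
}
```

With n queued probes, insert_each() costs n IPI rounds and
insert_batched() costs one, at the price of holding the other CPUs for
the duration of all n writes.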

> 
> >Here is the algorithm djprobes uses to
> >
> >       IA
> >        | [-2][-1][0][1][2][3][4][5][6][7]
> >       [ins1][ins2][  ins3 ]
> >       [<-     DCR       ->]
> >          [<- JTPR ->]
> >
> >ins1: 1st Instruction
> >ins2: 2nd Instruction
> >ins3: 3rd Instruction
> >IA:  Insertion Address
> >JTPR: Jump Target Prohibition Region
> >DCR: Detoured Code Region
> >
> >
> >The replacement procedure of djprobes is the following (I have
> >simplified the actual steps djprobes uses, for readability)
> >
> >(1) copying the instruction(s) in the DCR
> >(2) putting a breakpoint instruction at the IA
> >(3) making sure no CPU has the instructions being replaced in its
> >cache, to avoid a jump into the middle of the jmp instruction
> >(4) replacing the original instruction(s) with the jump instruction
> >
> >As you can see from the above, your suggestion is very similar to
> >djprobes, hence I believe all the issues related to djprobes will be
> >valid for yours as well.
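
For reference, steps (1) to (4) above can be sketched on a plain byte
buffer. This is a user-space illustration under assumptions, not the
djprobes source: struct detour, install_detour() and
wait_until_no_cpu_is_inside() are hypothetical names, step (3) is an
empty placeholder, and writing the displacement before the opcode is a
choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCCu /* x86 breakpoint */
#define JMP_OPCODE  0xE9u /* x86 "jmp rel32" */
#define JMP_SIZE    5     /* 1 opcode byte + 4 displacement bytes */
#define MAX_DCR     16    /* arbitrary cap for this sketch */

struct detour {
    uint8_t *ia;            /* insertion address (IA) */
    size_t   dcr_len;       /* bytes of whole instructions in the DCR */
    uint8_t  copy[MAX_DCR]; /* out-of-line copy of the DCR */
};

/* Hypothetical placeholder for step (3); a real version must wait or
 * synchronize until no CPU can still be executing inside the DCR. */
static void wait_until_no_cpu_is_inside(const struct detour *d)
{
    (void)d;
}

static void install_detour(struct detour *d, int32_t rel32)
{
    /* The DCR must be large enough to hold the 5-byte jump. */
    assert(d->dcr_len >= JMP_SIZE && d->dcr_len <= MAX_DCR);

    /* (1) copy the instruction(s) in the DCR */
    memcpy(d->copy, d->ia, d->dcr_len);

    /* (2) put a breakpoint at the IA */
    d->ia[0] = INT3_OPCODE;

    /* (3) make sure no CPU is still using the old instructions */
    wait_until_no_cpu_is_inside(d);

    /* (4) replace with the jump: displacement first, opcode last, so
     * the IA stays a breakpoint until the jump is complete */
    memcpy(d->ia + 1, &rel32, sizeof rel32);
    d->ia[0] = JMP_OPCODE;
}
```

Bytes of the DCR beyond the fifth are left untouched here; they are
unreachable only if nothing jumps into them, which is what the JTPR in
the diagram is about.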
> 
> The hooking seems very similar, yes, perhaps I can be lazy and just
> steal djprobes for this. The difference is that if we just replace the
> whole function, we can just shove arbitrary changes into functions, and
> do whatever we please. Plus we don't have to worry about locating
> internal variables, etc.
> 

A somewhat more complicated method:
how about inserting (instruction size) breakpoints and waiting
until all the threads have been scheduled at least once (so that
threads hit the breakpoint if their IPs are in the middle of the
instructions we want to replace with the jump), and then replacing
them with the jump instruction?
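
A minimal user-space sketch of that sequence, assuming a 5-byte x86
jmp rel32. The scheduled[] array stands in for whatever mechanism would
really tell us each thread has been scheduled since arming, and the
breakpoint handler that would redirect a thread hitting one of the int3
bytes is not modeled. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCCu /* x86 breakpoint */
#define JMP_OPCODE  0xE9u /* x86 "jmp rel32" */
#define JMP_SIZE    5     /* 1 opcode byte + 4 displacement bytes */

/* Step 1: cover every byte the jump will occupy with a breakpoint, so a
 * thread resuming anywhere in the region traps instead of executing a
 * partial instruction. */
static void arm_breakpoints(uint8_t *site)
{
    memset(site, INT3_OPCODE, JMP_SIZE);
}

/* True once every tracked thread has been scheduled at least once. */
static bool all_scheduled(const bool *scheduled, size_t nthreads)
{
    for (size_t i = 0; i < nthreads; i++)
        if (!scheduled[i])
            return false;
    return true;
}

/* Step 2: install the jump only when every thread has been scheduled.
 * Returns true if the jump was written, false if the caller must wait. */
static bool try_install_jump(uint8_t *site, int32_t rel32,
                             const bool *scheduled, size_t nthreads)
{
    assert(site != NULL);
    if (!all_scheduled(scheduled, nthreads))
        return false;
    /* Displacement first, opcode last: the first byte remains a
     * breakpoint until the rest of the jump is in place. */
    memcpy(site + 1, &rel32, sizeof rel32);
    site[0] = JMP_OPCODE;
    return true;
}
```

The caller would retry try_install_jump() until it returns true; until
then the region stays filled with breakpoints.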

Thanks
Prasanna

-- 
Prasanna S.P.
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-41776329

