This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: [PATCH -tip 4/5] kprobes/x86: Use text_poke_smp_batch


* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> >> Mathieu Desnoyers wrote:
> >>> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> >>>> Use text_poke_smp_batch() in the optimization path to reduce
> >>>> the number of stop_machine() calls.
> >>>>
> >>>> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> >>>> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> >>>> Cc: Ingo Molnar <mingo@elte.hu>
> >>>> Cc: Jim Keniston <jkenisto@us.ibm.com>
> >>>> Cc: Jason Baron <jbaron@redhat.com>
> >>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >>>> ---
> >>>>
> >>>>  arch/x86/kernel/kprobes.c |   37 ++++++++++++++++++++++++++++++-------
> >>>>  include/linux/kprobes.h   |    2 +-
> >>>>  kernel/kprobes.c          |   13 +------------
> >>>>  3 files changed, 32 insertions(+), 20 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
> >>>> index 345a4b1..63a5c24 100644
> >>>> --- a/arch/x86/kernel/kprobes.c
> >>>> +++ b/arch/x86/kernel/kprobes.c
> >>>> @@ -1385,10 +1385,14 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
> >>>>  	return 0;
> >>>>  }
> >>>>  
> >>>> -/* Replace a breakpoint (int3) with a relative jump.  */
> >>>> -int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
> >>>> +#define MAX_OPTIMIZE_PROBES 256
> >>>
> >>> So what kind of interrupt latency does a 256-probe batch generate on the
> >>> system? Are we talking about a few milliseconds, or a few seconds?
> >>
> >> In my experiment on kvm with 4 CPUs, it took about 3 seconds on average.
> > 
> > That's 3 seconds for multiple calls to stop_machine(). So we can expect
> > latencies in the area of a few microseconds for each call, right?
> 
> Sorry, my mistake. An untuned kvm guest is very slow...
> I checked again on a *bare machine* (4-core Xeon 2.33GHz, 4 CPUs).
> I found that even without this patch, optimizing 256 probes took 770us on
> average (min 150us, max 3.3ms).
> With this patch, it went down to 90us on average (min 14us, max 324us!)
> 
> Isn't that low enough latency? :)
> 
> >> With this patch, it went down to 30ms. (x100 faster :))
> > 
> > This raises the latency from a few microseconds to 30ms. It sounds like a
> > regression rather than a gain to me.
> 
> So it takes just 90us on average. I hope that is acceptable.

Yes, this is far below the scheduler tick, which is much more acceptable.
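
To put the bare-metal figures quoted above next to the scheduler tick, here
is a small sketch in C. The thread does not say which tick rate the test
machine used, so the HZ values below (100, 250, 1000, all common kernel
configuration choices) are an assumption, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Tick period in microseconds for a given timer frequency (HZ). */
static unsigned long tick_period_us(unsigned long hz)
{
	assert(hz > 0);
	return 1000000UL / hz;
}

/* Latency as a percentage of one tick period (integer, rounded down). */
static unsigned long percent_of_tick(unsigned long latency_us, unsigned long hz)
{
	return latency_us * 100UL / tick_period_us(hz);
}

/* Print how the patched latencies compare with one tick at several HZ. */
static void print_latency_table(void)
{
	/* Assumed HZ values; the thread does not state the test config. */
	static const unsigned long hz_values[] = { 100, 250, 1000 };
	/* Bare-metal numbers with the patch, as reported in the thread. */
	const unsigned long avg_us = 90, max_us = 324;
	size_t i;

	for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++) {
		unsigned long hz = hz_values[i];

		printf("HZ=%lu tick=%luus avg=%lu%% max=%lu%%\n",
		       hz, tick_period_us(hz),
		       percent_of_tick(avg_us, hz),
		       percent_of_tick(max_us, hz));
	}
}
```

Assuming HZ=1000, one tick is 1000us, so the 90us average is about 9% of a
tick and the 324us maximum about 32%. The 3.3ms maximum measured without the
patch would span more than three ticks at that rate.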

Thanks,

Mathieu

> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu
> e-mail: mhiramat@redhat.com

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
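
The idea in the patch changelog is to pay for one stop_machine() round per
batch rather than one per probe. The user-space sketch below only counts
those rounds. It does not modify kernel text and is not the patch's
implementation; every name in it (struct fake_poke, fake_stop_machine,
poke_one_by_one, poke_batched) is hypothetical. Only the batch limit of 256
is taken from the quoted diff (MAX_OPTIMIZE_PROBES).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPTIMIZE_PROBES 256	/* batch limit taken from the quoted diff */

/* Hypothetical stand-in for one pending text modification. */
struct fake_poke {
	int applied;
};

/* Counts simulated machine-wide stops; stands in for stop_machine(). */
static unsigned long stop_rounds;

/* Apply n pokes during a single simulated stop. */
static void fake_stop_machine(struct fake_poke *pokes, size_t n)
{
	size_t i;

	assert(pokes != NULL || n == 0);
	stop_rounds++;
	for (i = 0; i < n; i++)
		pokes[i].applied = 1;
}

/* Unbatched: one simulated stop per probe. Returns the number of stops. */
static unsigned long poke_one_by_one(struct fake_poke *pokes, size_t n)
{
	size_t i;

	stop_rounds = 0;
	for (i = 0; i < n; i++)
		fake_stop_machine(&pokes[i], 1);
	return stop_rounds;
}

/* Batched: one simulated stop per chunk of up to MAX_OPTIMIZE_PROBES. */
static unsigned long poke_batched(struct fake_poke *pokes, size_t n)
{
	size_t done = 0;

	stop_rounds = 0;
	while (done < n) {
		size_t chunk = n - done;

		if (chunk > MAX_OPTIMIZE_PROBES)
			chunk = MAX_OPTIMIZE_PROBES;
		fake_stop_machine(pokes + done, chunk);
		done += chunk;
	}
	return stop_rounds;
}
```

For 256 pending probes, poke_one_by_one() performs 256 simulated stops while
poke_batched() performs one. Each batched stop does more work, which is why
the thread asks how long a single 256-probe batch keeps the machine stopped.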

