This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [PATCH -tip v8 0/9] kprobes: Kprobes jump optimization support


* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Hi,
> 
> Here is v8 of the kprobes jump optimization patchset
> (a.k.a. Djprobe). This version is just moved onto
> 2.6.33-rc4-tip. Ingo, I assume it's a good time to
> push this code onto the -tip tree (maybe a development branch?),
> since people can test it with perf-probe.
> 
> I've decided to make a separate patch series for
> jump optimization with text_poke_smp(), which is
> 'officially' supported on Intel's processors.
> So this version of the patches is just updated against
> the latest tip/master; no other updates are included.
> 
> I know that the int3-bypassing method (text_poke_fixup())
> is currently, unofficially, believed to be safe. But we
> need to get more official answers from x86 vendors.
> Moreover, we need to tweak entry_*.S to prevent
> recursive NMIs, because an int3 inside an NMI handler will
> lift NMI blocking. I'd like to push it after this
> series of patches is merged.
> 
> Anyway, thanks to Mathieu and Peter for helping me
> implement it and organizing the discussion points about
> int3-bypass XMC!
> 
> These patches can be applied on the latest -tip.
> 
> Changes in v8:
>  - Update patches against the latest tip/master.
>  - Drop text_poke_fixup() related patches.
>  - Update benchmark results and add jprobes and kprobe(post-handler)
>    results.
> 
> And the kprobe stress test didn't find any regressions from kprobes
> under kvm/x86.
> 
> TODO:
>  - Support NMI-safe int3-bypassing text_poke.

Please have a look at:

"x86 NMI-safe INT3 and Page Fault"
http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=commit;h=90516e3c718e0502f6f2eb616fad4447645ca47d

and

"x86_64 page fault NMI-safe"
http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=commit;h=ad1bf11a68c35a44edd8d686a0842896f408e17c

That turns this TODO into the "done" section ;)

I've been using these patches in the lttng tree for 1-2 years.

Thanks,

Mathieu


>  - Support preemptive kernel (by stack unwinding and checking address).
> 
> 
> Jump Optimized Kprobes
> ======================
> o Concept
>  Kprobes uses the int3 breakpoint instruction on x86 to insert probes
> into a running kernel. Jump optimization allows kprobes to replace the
> breakpoint with a jump instruction, drastically reducing probing overhead.
> 
> o Performance
>  An optimized kprobe is more than 5 times faster than a normal kprobe.
> 
>  Optimizing probes improves their performance. Usually, a kprobe hit takes
> 0.5 to 1.0 microseconds to process. On the other hand, a jump optimized
> probe hit takes less than 0.1 microseconds (the actual number depends on the
> processor). Here are some sample overheads.
> 
> Intel(R) Xeon(R) CPU E5410  @ 2.33GHz
> (without debugging options, with text_poke_smp patch, 2.6.33-rc4-tip+)
> 
>                         x86-32  x86-64
> kprobe:                 0.80us  0.99us
> kprobe+booster:         0.33us  0.43us
> kprobe+optimized:       0.05us  0.06us
> kprobe(post-handler):   0.81us  1.00us
> 
> kretprobe:              1.10us  1.24us
> kretprobe+booster:      0.61us  0.68us
> kretprobe+optimized:    0.33us  0.30us
> 
> jprobe:                 1.37us  1.67us
> jprobe+booster:         0.80us  1.10us
> 
> (booster skips single-stepping, kprobe with post handler
>  isn't boosted/optimized, and jprobe isn't optimized.)
> 
>  Note that jump optimization also consumes more memory, but not much.
> It uses ~200 bytes per probe, so even if you use ~10,000 probes, it
> consumes only a few MB (~200 bytes x 10,000 is about 2 MB).
> 
> 
> o Usage
>  Set CONFIG_OPTPROBES=y when building a kernel, then all *probes will be
> optimized if possible.
> 
>  Kprobes decodes the probed function and checks whether the target instructions
> can be optimized (replaced with a jump) safely. If they can't be, Kprobes just
> doesn't optimize them.
> 
> 
> o Optimization
>   Before preparing the optimization, Kprobes inserts the original (user-defined)
>  kprobe at the specified address. So, even if the kprobe cannot be
>  optimized, it just works as a normal kprobe.
> 
>  - Safety check
>   First, Kprobes gets the address of the probed function and checks that the
>  optimized region, which will be replaced by a jump instruction, does NOT
>  straddle the function boundary, because if the optimized region reaches the
>  next function, its callers will get unexpected results.
>   Next, Kprobes decodes the whole body of the probed function and checks that
>  there is NO indirect jump, NO instruction which can cause an exception
>  (checked against the exception tables; such an instruction jumps to fixup
>  code, and the fixup code jumps back into the same function body), and NO near
>  jump which jumps into the optimized region (except to the 1st byte of the
>  jump), because a jump into the middle of another instruction causes
>  unexpected results too.
>   Kprobes also measures the length of the instructions which will be replaced
>  by the jump instruction: because a jump instruction is longer than 1 byte,
>  it may replace multiple instructions. Kprobes then checks whether those
>  instructions can be executed out-of-line.
> 
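A minimal sketch of the checks above, written as user-space C rather than kernel code. The helper names, the offset-based interface, and the precomputed instruction lengths are all assumptions made for illustration; the patches themselves decode instructions in the kernel and also reject indirect jumps and exception-table entries, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* An x86 near relative jump is 5 bytes: opcode 0xE9 + 32-bit displacement. */
#define RELJUMP_SIZE 5

/*
 * Hypothetical helper: given the lengths of the instructions that start at
 * the probe address, return the number of bytes that must be replaced so
 * that only whole instructions are overwritten by the jump.
 * Returns 0 if the instructions run out before the jump fits.
 */
static size_t optimized_region_size(const unsigned char *insn_len, size_t n)
{
	size_t size = 0;

	for (size_t i = 0; i < n && size < RELJUMP_SIZE; i++)
		size += insn_len[i];
	return size >= RELJUMP_SIZE ? size : 0;
}

/* The optimized region must not straddle the end of the probed function. */
static bool region_fits_function(size_t probe_off, size_t region,
				 size_t func_size)
{
	return region != 0 && probe_off + region <= func_size;
}

/*
 * A near jump inside the function may target the 1st byte of the region
 * (the jump itself), but not any later byte inside the region.
 */
static bool jump_target_ok(size_t target_off, size_t probe_off, size_t region)
{
	return target_off <= probe_off || target_off >= probe_off + region;
}
```

For example, with instruction lengths 1, 2, 3, 4 at the probe address, the region is 6 bytes: the first three instructions are the fewest whole instructions that cover the 5-byte jump.
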
>  - Preparing detour code
>   Then, Kprobes prepares a "detour" buffer, which contains exception-emulating
>  code (push/pop registers, call handler), the copied instructions (Kprobes
>  copies the instructions which will be replaced by a jump into the detour
>  buffer), and a jump back to the original execution path.
> 
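Both the jump placed at the probe address and the jump at the end of the detour buffer are plain x86 near relative jumps. The sketch below shows how such a jump is encoded. It is user-space C for illustration only: encode_rel_jump() and RELATIVEJUMP_SIZE are names assumed here, and the sketch does not check that the displacement fits in 32 bits.

```c
#include <stdint.h>

#define RELATIVEJUMP_OPCODE 0xe9	/* x86 "jmp rel32" opcode */
#define RELATIVEJUMP_SIZE   5		/* opcode + 4-byte displacement (assumed name) */

/*
 * Hypothetical helper: write a "jmp rel32" into buf as if it were located
 * at address 'from', so that execution continues at address 'to'.
 * The displacement is relative to the end of the jump instruction.
 */
static void encode_rel_jump(uint8_t *buf, uint64_t from, uint64_t to)
{
	/* Unsigned wrap-around yields the two's-complement displacement. */
	uint32_t rel = (uint32_t)(to - (from + RELATIVEJUMP_SIZE));

	buf[0] = RELATIVEJUMP_OPCODE;
	buf[1] = (uint8_t)(rel & 0xff);		/* little-endian, as on x86 */
	buf[2] = (uint8_t)((rel >> 8) & 0xff);
	buf[3] = (uint8_t)((rel >> 16) & 0xff);
	buf[4] = (uint8_t)((rel >> 24) & 0xff);
}
```

In the detour buffer, 'to' would be the probe address plus the size of the optimized region, so that execution resumes at the first instruction that was not replaced.
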
>  - Pre-optimization
>   After preparing the detour code, Kprobes enqueues the kprobe on the
>  optimizing list and kicks the kprobe-optimizer workqueue to optimize it. To
>  wait for other optimized probes, the kprobe-optimizer delays its work.
>   When the optimized-kprobe is hit before optimization, its handler
>  changes the IP (instruction pointer) to the copied code and exits. So, those
>  copied instructions are executed in the detour buffer.
> 
>  - Optimization
>   The kprobe-optimizer doesn't start replacing instructions right away; it
>  waits for synchronize_sched() for safety, because a processor may have been
>  interrupted in the middle of the instruction series (at the 2nd or Nth
>  instruction) which will be replaced by a jump instruction(*).
>  As you know, synchronize_sched() can ensure that all interrupts which
>  were running when synchronize_sched() was called have finished, but only if
>  CONFIG_PREEMPT=n. So, this version supports only kernels with
>  CONFIG_PREEMPT=n.(**)
>   After that, the kprobe-optimizer calls stop_machine() to replace the probed
>  instructions with a jump instruction by using text_poke_smp().
> 
>  - Unoptimization
>   When a kprobe is unregistered, disabled, or blocked by another kprobe,
>  an optimized-kprobe will be unoptimized. If this happens before the
>  kprobe-optimizer runs, the kprobe is just dequeued from the optimizing list.
>  If the optimization has already been done, the jump is replaced with an int3
>  breakpoint and the original code by using text_poke_smp().
> 
> (*)Please imagine that the 2nd instruction is interrupted and
> the optimizer replaces the 2nd instruction with the jump *address*
> while the interrupt handler is running. When the interrupt
> returns to the original address, there is no valid instruction
> there, and it causes unexpected results.
> 
> (**)This optimization-safety check may be replaced with the stop-machine
> method that ksplice uses, to support CONFIG_PREEMPT=y kernels.
> 
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (9):
>       kprobes: Add documents of jump optimization
>       kprobes/x86: Support kprobes jump optimization on x86
>       x86: Add text_poke_smp for SMP cross modifying code
>       kprobes/x86: Cleanup save/restore registers
>       kprobes/x86: Boost probes when reentering
>       kprobes: Jump optimization sysctl interface
>       kprobes: Introduce kprobes jump optimization
>       kprobes: Introduce generic insn_slot framework
>       kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
> 
> 
>  Documentation/kprobes.txt          |  191 ++++++++++-
>  arch/Kconfig                       |   13 +
>  arch/x86/Kconfig                   |    1 
>  arch/x86/include/asm/alternative.h |    4 
>  arch/x86/include/asm/kprobes.h     |   31 ++
>  arch/x86/kernel/alternative.c      |   60 +++
>  arch/x86/kernel/kprobes.c          |  596 ++++++++++++++++++++++++++++------
>  include/linux/kprobes.h            |   44 +++
>  kernel/kprobes.c                   |  626 +++++++++++++++++++++++++++++++-----
>  kernel/sysctl.c                    |   12 +
>  10 files changed, 1373 insertions(+), 205 deletions(-)
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

