This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC PATCH 0/6] kprobes: remove global kprobe_lock


Mathieu Desnoyers wrote:
* Ananth N Mavinakayanahalli (ananth@in.ibm.com) wrote:

Hi,

The following set of patches replaces the global spinlock (kprobe_lock)
with an rwlock. With this change, it is now possible to have parallel
execution of kprobes (same or different), without having to spin on the
kprobe_lock. Of course, the handlers are required to be reentrant in
order to obtain accurate results, or they have to take care of
serializing if they share variables (counters, for example).



Well, it looks like a problem I had to face in my experimental LTT
implementation. Since tracing should have a minimal impact on
performance, your goal is to have the fastest possible locks in the
critical path. In fact, rwlocks are not made for that purpose: they keep
a count of readers, which makes that value bounce from one CPU to
another, causing cache invalidations.

Yes, cacheline bouncing is an issue. But note that this is just the first step in making kprobes scalable. And since context switches are involved with kprobes in any case, due to breakpointing and single-stepping, I don't know if it is that big an issue.

As the locking guide in Documentation/DocBook/ says, rwlocks are meant to
protect code paths that take a long time to execute; otherwise they are not
worth the performance cost compared to the contention caused by a spinlock.

If what you are looking for is scalability, here are the two final locking
schemes I came up with:

* Use atomic operations (no locking at all)
* Use a per_cpu spinlock:
  This is interesting in cases where you almost never write to a data
  structure but read it very often. Here is the basic idea:
  - writers take every spinlock, in the very same order. Once a writer has
    them all, it has write access to the structure.
  - readers only take their own per-CPU spinlock. This ensures that no
    writer is currently modifying the data.

And then there is RCU. With RCU you can run handlers without *any* locking.

Someone at OLS suggested that it was like the brlock (big reader lock).
The subtle enhancement in my implementation is the use of per_cpu variables
to hold the spinlocks; the benefits:
  - No false sharing of the spinlocks.
  - No precious CPU cache space wasted by aligning each spinlock on cache
    line boundaries: the alignment is implicit in the per_cpu variables.


What do you think about it?

Well, I have an RCU-based prototype which could potentially be the
fastest of the lot. However, I saw some weird issues on an 8-way x86 SMP
box with a kprobe on "schedule" and a "make -j8" of the Linux kernel:
rmmod on the kprobe module never returned. Still working on it; it needs
more polishing and testing :-)


I've heard from most users that they don't care about probe
insertion/removal overheads; they are only concerned with handler
execution times.


The goal is to finally have an RCU based mechanism for kprobes. The rwlock patchset is the first step in that direction.

Ananth

