This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [RFC] Systemtap translator support for hardware breakpoints on


On Thu, Jan 07, 2010 at 05:53:01PM -0800, Roland McGrath wrote:
> Another thing to note is that hw_breakpoint registrations can fail at
> runtime for "normal" reasons.  The hardware watchpoints are a scarce global
> resource (4 on x86, only 1 on powerpc).
> 
> The translator might encode some per-arch assumptions about the known
> kernel implementation and so give you a warning/error if your script as
> elaborated statically requires more registrations than the kernel ever
> supports (i.e. more than 4 on x86, more than 1 on powerpc).
> 
> But even if you only use one, it might not be free at runtime.  What I
> presume your code does is just fail module initialization if registration
> fails, so staprun just fails quickly and never even tries to run the
> script.
> 
> Here's an idea for being fancier:
> 
> 	probe kernel.data($foo).change { ... }
> 	probe kernel.data($foo).change.unavailable {
> 	  pred = 1
> 	  println("NOTE: watchpt not available, using plan B")
> 	}
> 
> 	probe kernel.function("foo1") if pred { ... }
> 	probe kernel.function("foo2") if pred { ... }
> 	probe kernel.function("foo3") if pred { ... }
> 
> If there is no .unavailable probe given for some watchpoint probe, then
> the default is to fail at initialization time.  If there is one, it acts
> like a .begin probe.
> 
> That brings up a similar tangential idea.  If we have the dynamic
> watchpoint probes in some form, then:
> 
> 	probe watch_foo(loc) = .data(loc).change { ... }
> 	probe watch_foo(loc) = .data(loc).change.begin { ... }
> 	probe watch_foo(loc) = .data(loc).change.end { ... }
> 
> Those .begin and .end would run when "enable watch_foo" and "disable
> watch_foo" happens.  The idea is that a tapset could define watch_foo
> for watching a particular kind of thing in some fancy way, and then use
> these to initialize and destroy elements in global state arrays it might
> keep for each watchpoint instance to use across multiple changes to the
> same variable.  For example, when watching a pointer variable you could
> be tracking in some tapset-global array of used/live pointers of that
> sort.
>

If I understand correctly, this suggestion requires an 'overcommit'
feature in hw-breakpoints (the ability to accept more requests than
there are debug registers), right?

In its new form (post perf-events integration), hw-breakpoints can
indeed accept far more requests than there are underlying debug
registers. This is done by making an 'un-pinned' breakpoint request:
such requests take turns using a debug register in round-robin fashion
(the scheduling itself is provided by the perf-events infrastructure).

In other words, on x86 it is possible to have, say, 3 'pinned'
breakpoint requests (each of which holds one debug register
exclusively) and any number of 'un-pinned' breakpoint requests (which
are scheduled to share the remaining debug register in RR fashion).
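
To make the pinned/un-pinned split concrete, here is a toy model in
Python. It is only a sketch of the behaviour described above, not the
perf-events code: the function name schedule(), the request labels and
the idea of a fixed "tick" are all assumptions made for illustration.

```python
from collections import deque

def schedule(num_registers, pinned, unpinned, ticks):
    """Toy model of debug-register sharing (hypothetical, for illustration).

    Pinned requests hold a register on every tick. Un-pinned requests
    take turns on whatever registers are left, in round-robin order.
    Returns one list per tick naming the requests that are active.
    """
    if len(pinned) > num_registers:
        # Pinned requests cannot be overcommitted in this model.
        raise ValueError("more pinned requests than debug registers")

    free = num_registers - len(pinned)   # registers left for sharing
    queue = deque(unpinned)              # round-robin order
    timeline = []

    for _ in range(ticks):
        n = min(free, len(queue))        # how many un-pinned fit this tick
        running = [queue[i] for i in range(n)]
        timeline.append(list(pinned) + running)
        queue.rotate(-n)                 # those that ran go to the back

    return timeline

# x86-like case: 4 registers, 3 pinned, 3 un-pinned sharing the last one.
for tick, active in enumerate(schedule(4, ["p1", "p2", "p3"],
                                       ["u1", "u2", "u3"], 4)):
    print(tick, active)
```

In this model an un-pinned request is active only on some ticks, so a
watchpoint placed that way can miss accesses while it is scheduled out.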

Presently, the breakpoint infrastructure does not provide callbacks
that are invoked whenever an 'un-pinned' breakpoint request is
scheduled in or out (analogous to .enabled and .disabled). We could
pursue adding such support (of course, that would require a good
in-kernel user to convince the community!).

Thanks,
K.Prasad

