
remote target single step


My question relates to debugging a remote system at a
low level (kernel debugging, if you will) using
"target remote", where a gdb stub provides execution
control.

If the CPU does not have hardware single step support,
then single step in response to the "s" remote protocol
request has to be implemented by writing breakpoint
instructions into the code at the spot execution will
reach next.  What kills us is that other threads of
control can come along and hit these single step
breakpoints.  Interrupts, and other CPUs in a
multiprocessor system, are typical sources of these
errant threads of control.
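
For concreteness, here is a minimal sketch of the usual
approach, assuming a fixed-length-instruction CPU; the names
next_pc_of(), flush_icache(), and BREAK_OPCODE are made up
for illustration, not taken from any real stub:

    /* Software single step: plant a trap where execution will
       land next.  All helper names are hypothetical.  A real
       stub must patch BOTH successors of a conditional branch;
       one target is shown for brevity. */

    extern unsigned long next_pc_of(unsigned long pc); /* insn decoder */
    extern void flush_icache(unsigned long addr);
    #define BREAK_OPCODE 0x00000001UL   /* placeholder trap opcode */

    static unsigned long step_addr;     /* where the trap was planted */
    static unsigned long saved_insn;    /* original instruction word  */

    void plant_single_step(unsigned long pc)
    {
        step_addr  = next_pc_of(pc);
        saved_insn = *(unsigned long *)step_addr;
        *(unsigned long *)step_addr = BREAK_OPCODE;
        flush_icache(step_addr);        /* make the patch visible */
    }

    /* Called when the trap fires -- in whatever context reached it. */
    void remove_single_step(void)
    {
        *(unsigned long *)step_addr = saved_insn;
        flush_icache(step_addr);
    }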

The first thought is to tame them by, say, capturing the
other CPUs in a spin loop across a gdb single step, or by
disabling interrupts across the step.  But now we come to
it: gdb frequently sends a "c" remote protocol request
when the user asks gdb to step or next.  "c" is an
instruction to the stub to continue in open-ended fashion.
The stub has no context to know whether control will
return in one cycle, a billion, or never.  So it cannot
leave interrupts disabled, and it cannot hold the other
CPUs quiesced; it must let everything go.  And then we're
back to square one: if this *was* really a single step,
those threads of control can hit the single step
breakpoint.
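
A skeleton of the stub's resume path shows the asymmetry;
freeze_other_cpus() and the other helpers here are made-up
names, sketched under the quiescing scheme just described:

    /* Sketch of a stub's resume logic.  Helpers are
       hypothetical.  Note the asymmetry between 's' and 'c'. */

    extern void freeze_other_cpus(void);   /* spin-capture others */
    extern void release_other_cpus(void);
    extern void disable_interrupts(void);
    extern void enable_interrupts(void);
    extern unsigned long read_pc(void);
    extern void plant_single_step(unsigned long pc);

    void resume_target(char request)
    {
        if (request == 's') {
            /* Bounded: one instruction.  Safe to stay quiesced. */
            freeze_other_cpus();
            disable_interrupts();
            plant_single_step(read_pc());
        } else {               /* 'c' */
            /* Unbounded: control may return in one cycle, a
               billion, or never.  Everything must be released --
               even though gdb may have just planted a step-like
               breakpoint and this 'c' is really part of a step
               or next. */
            enable_interrupts();
            release_other_cpus();
        }
        /* Return to the target; the next trap re-enters the stub. */
    }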

Why is it bad for other threads of control to hit single
step breakpoints?  It *is* bad, and maybe the gdb experts
here could explain the mechanism in detail.  The symptom
is that gdb starts single stepping in the context of the
unwanted thread of control, say an interrupt routine, and
control never returns to the context from which the user
issued the step.

Can the user code space be partitioned from the interrupt
handling code space?  Another interesting gdb behavior
makes this very difficult, at least in our environment.
When gdb is told to "next" (step across) a subroutine
call, the remote protocol first does a step INTO the
subroutine, reads the return address register, puts a
breakpoint there, and then does a continue.  This means
that even if users are taught not to step into certain
sensitive routines, but to "next" across them in order to
avoid losing control, they cannot actually avoid
perturbing the code of those sensitive routines.
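
Seen from gdb's side, the sequence is roughly the sketch
below.  The packet-level helpers are made-up names, and the
real logic lives spread through gdb's step/next machinery,
not in one tidy function:

    /* Roughly what "next" across a call turns into on a target
       with no hardware single step.  Names are illustrative. */

    extern void send_packet(const char *pkt);
    extern void wait_for_stop_reply(void);
    extern unsigned long read_return_address_register(void);
    extern void write_breakpoint(unsigned long addr);  /* patches code */
    extern void restore_original_insn(unsigned long addr);

    void next_over_call(void)
    {
        send_packet("s");         /* step INTO the callee's first insn */
        wait_for_stop_reply();
        unsigned long ra = read_return_address_register();
        write_breakpoint(ra);     /* trap planted back in the caller */
        send_packet("c");         /* open-ended continue -- the stub
                                     must let interrupts and other
                                     CPUs run again */
        wait_for_stop_reply();
        restore_original_insn(ra);
    }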

This is the rather dismal situation as we currently
understand it.  How have other folks handled kernel
debugging in the face of these issues?  Are there gdb
configuration options that solve some of them?  I'd love
to set a configure flag and, voila, have gdb use a
software breakpoint on the caller's side of a subroutine
call instead of the "step; finish" method.
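
What I am wishing for, in the same made-up terms as the
sketch above, would never enter the callee at all.  This
assumes gdb can compute the length of the call instruction,
which is trivial on a fixed-length-instruction machine:

    /* Hoped-for alternative: breakpoint on the caller's side
       of the call, then continue.  insn_length_at() is
       hypothetical. */

    extern unsigned long insn_length_at(unsigned long pc);
    extern void send_packet(const char *pkt);
    extern void wait_for_stop_reply(void);
    extern void write_breakpoint(unsigned long addr);
    extern void restore_original_insn(unsigned long addr);

    void next_over_call_wanted(unsigned long pc)
    {
        unsigned long after = pc + insn_length_at(pc);
        write_breakpoint(after);  /* only the caller is perturbed;
                                     the sensitive callee is never
                                     entered or patched */
        send_packet("c");
        wait_for_stop_reply();
        restore_original_insn(after);
    }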

Kevin Nomura
Network Appliance

