This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [RFA] gdbserver/lynx178: spurious SIG61 signal when resuming inferior.


Hi Joel,

On 05/13/2013 11:46 AM, Joel Brobecker wrote:

> On ppc-lynx178, resuming the execution of a program after hitting
> a breakpoint sometimes triggers a spurious SIG61 event:

I'd like to understand this a little better.

Could that mean the thread that gdbserver used for ptrace hadn't
been ptrace stopped, or doesn't exist at all?  "sometimes" makes
me wonder about the latter.

>     (gdb) cont
>     Continuing.
> 
>     Program received signal SIG61, Real-time event 61.
>     [Switching to Thread 39]
>     0x10002324 in a_test.task1 (<_task>=0x3ffff774) at a_test.adb:30
>     30          select  -- Task 1
> 
> From this point on, continuing again lets the signal kill the program.
> Using "signal 0" or configuring GDB to discard the signal does not
> help either, as the program immediately reports the same signal again.
> 
> What happens is the following:
> 
>   - GDB sends a single-step order to gdbserver: $vCont;s:31
>     This tells GDBserver to do a step using thread 0x31=49.
>     GDBserver does the step, and thread 49 receives the SIGTRAP
>     indicating that the step has finished.
> 
>   - GDB then sends a "continue", but this time does not specify
>     which thread to continue: $vCont;c
>     GDBserver uses an arbitrary thread's ptid to resume the program's
>     execution (the current_inferior's ptid was chosen for that).
>     See lynx-low.c:lynx_resume:

Urgh.

So does that mean scheduler locking doesn't work?

E.g.,

(gdb) thread 2
(gdb) si
(gdb) thread 1
(gdb) c

That'll single-step thread 2, and then continue just thread 1, supposedly
triggering this issue too?  If not, why not?

BTW, vCont;c means "resume all threads", why is the current code just
resuming one?
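
To make the difference between the two packets concrete, here is a rough
sketch of how the first action of a vCont packet can be decoded.  This is
not gdbserver's actual parser: the struct and function names are made up
for illustration, the leading "$" and the checksum are assumed to be
already stripped, and only plain hex thread ids are handled (no
"p<pid>.<tid>" form, no "-1").

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; not gdbserver's real data structure.  */
struct vcont_action
{
  char kind;       /* 'c' = continue, 's' = single-step.  */
  int has_thread;  /* Zero when the action names no thread.  */
  long tid;        /* Meaningful only when has_thread is nonzero.  */
};

/* Decode the first action of a "vCont;ACTION[:TID]" packet body.
   Return 0 on success, -1 if the packet is not understood.  */
static int
parse_first_vcont_action (const char *packet, struct vcont_action *out)
{
  static const char prefix[] = "vCont;";
  const char *p;
  char *end;

  if (strncmp (packet, prefix, sizeof prefix - 1) != 0)
    return -1;

  p = packet + sizeof prefix - 1;
  if (*p != 'c' && *p != 's')
    return -1;

  out->kind = *p++;
  out->has_thread = 0;
  out->tid = 0;

  if (*p == ':')
    {
      /* Thread ids are sent in hexadecimal.  */
      out->tid = strtol (p + 1, &end, 16);
      if (end == p + 1)
        return -1;
      out->has_thread = 1;
    }

  return 0;
}
```

With this sketch, "vCont;s:31" decodes to a step of thread 0x31 = 49,
while "vCont;c" decodes to a continue with no thread attached, i.e. an
action that applies to every thread rather than to one picked by the
server.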

This:

lynx_wait_1 ()
...
  if (ptid_equal (ptid, minus_one_ptid))
    pid = lynx_ptid_get_pid (thread_to_gdb_id (current_inferior));
  else
    pid = BUILDPID (lynx_ptid_get_pid (ptid), lynx_ptid_get_tid (ptid));

retry:

  ret = lynx_waitpid (pid, &wstat);


is also suspicious.  Doesn't that mean we're doing a waitpid on a
current_inferior that may not have been resumed (and that may not be
the main task, if that matters)?  Could _that_ be the reason for that
magic signal 61?

-- 
Pedro Alves

