This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



SIGSEGV on exit from subroutines -- problem with non-stop ?


Hi,

I am using gdb 7.2-14.fc14 to work on a large multi-threaded
application, written in C, on x86-64.

My .gdbinit, per the book, contains:

  set target async 1
  set pagination off
  set non-stop on

When I step using 's' or 'n', I keep getting a SIGSEGV as execution
leaves some subroutines, for example:

  Program received signal SIGSEGV, Segmentation fault.
  signal_set (signo=Cannot access memory at address
0xffffffffffffff5c)
  at ...

When I 'disass', the current instruction is a leaveq.  Examining the
registers, I see that rbp is zero, which is clearly nonsense.

I found one instance which was repeatable, and which happened to be
before any threads were started: if I 'ni' through a particular
function, it gets to the leaveq and gets stuck there.  Each time I do
'ni', rsp and rbp are updated by the repeated leaveq, until it goes bang.

So... I began to think this isn't something complicated to do with
multiple threads... so here is a test:

<<--test.c-----------------------------------------------
#include <stdio.h>
#include <stdlib.h>

static void
target(const char* message) {
	printf("%s ...BANG!\n", message) ;
}

int main(int argc, char* argv[]) {

	target("Light the blue touch paper") ;

	return 0 ;
}
------------------------------------------------------->>

Compiled with gcc 4.5.1, using "-g -O0".

If I do "gdb test", stepping by "n":

<<-------------------------------------------------------
(gdb) show non-stop
Controlling the inferior in non-stop mode is on.
(gdb) b target
Breakpoint 1 at 0x4004d0: file test.c, line 6.
(gdb) run
Starting program: ...........test 

Breakpoint 1, target (message=0x400615 "Light the blue touch paper")
at test.c:6
6		printf("%s ...BANG!\n", message) ;
(gdb) n
Light the blue touch paper ...BANG!
7	}
(gdb) n

Program received signal SIGSEGV, Segmentation fault.
target (message=Cannot access memory at address 0xfffffffffffffff8
) at test.c:7
7	}
(gdb) info reg
....
rbp   0x0         	0x0
rsp   0x7fffffffe248	0x7fffffffe248
....
rip   0x4004e9		0x4004e9 	<target+37>
....
------------------------------------------------------->>

Or, stepping by 'ni':

<<-------------------------------------------------------
(gdb) show non-stop
Controlling the inferior in non-stop mode is on.
(gdb) b target
Breakpoint 1 at 0x4004d0: file test.c, line 6.
(gdb) disass target
Dump of assembler code for function target:
   0x00000000004004c4 <+0>:	push   %rbp
   0x00000000004004c5 <+1>:	mov    %rsp,%rbp
   0x00000000004004c8 <+4>:	sub    $0x10,%rsp
   0x00000000004004cc <+8>:	mov    %rdi,-0x8(%rbp)
   0x00000000004004d0 <+12>:	mov    $0x400608,%eax
   0x00000000004004d5 <+17>:	mov    -0x8(%rbp),%rdx
   0x00000000004004d9 <+21>:	mov    %rdx,%rsi
   0x00000000004004dc <+24>:	mov    %rax,%rdi
   0x00000000004004df <+27>:	mov    $0x0,%eax
   0x00000000004004e4 <+32>:	callq  0x4003b8 <printf@plt>
   0x00000000004004e9 <+37>:	leaveq 
   0x00000000004004ea <+38>:	retq   
End of assembler dump.
(gdb) disp/i $pc
(gdb) run
Starting program: .......test 

Breakpoint 1, target (message=0x400615 "Light the blue touch paper")
at test.c:6
6		printf("%s ...BANG!\n", message) ;
.....
1: x/i $pc
=> 0x4004e4 <target+32>:	callq  0x4003b8 <printf@plt>
(gdb) ni
Light the blue touch paper ...BANG!
7	}
1: x/i $pc
=> 0x4004e9 <target+37>:	leaveq 
(gdb) ni
target (message=0x100000000 <Address 0x100000000 out of bounds>) at
test.c:7
7	}
1: x/i $pc
=> 0x4004e9 <target+37>:	leaveq 
(gdb) ni
Cannot access memory at address 0x8
(gdb) ni
The program is not being run.
------------------------------------------------------->>

I note that if I turn off the "non-stop" option, it works.  So this is
something to do with debugging multi-threaded programs!

I note also that if I change the target to:

  static int
  target(const char* message) {
          printf("%s ...BANG!\n", message) ;
          return 0 ;
  }

the problem goes away... so one extra instruction between the callq
and the leaveq makes a difference:

   0x00000000004004dc <+24>:	mov    %rax,%rdi
   0x00000000004004df <+27>:	mov    $0x0,%eax
   0x00000000004004e4 <+32>:	callq  0x4003b8 <printf@plt>
   0x00000000004004e9 <+37>:	mov    $0x0,%eax
   0x00000000004004ee <+42>:	leaveq 
   0x00000000004004ef <+43>:	retq   

This goes some way to explaining why it appeared to be a sporadic
problem.

Is this me, or is this a bug?  It used to work :-(

Thanks,

Chris

