This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
RE: Single step vs. "tail recursion" optimization
- From: "Donn Terry" <donnte at microsoft dot com>
- To: "Michael Snyder" <msnyder at redhat dot com>
- Cc: <gdb-patches at sources dot redhat dot com>
- Date: Fri, 8 Nov 2002 13:57:36 -0800
- Subject: RE: Single step vs. "tail recursion" optimization
(I'm sorry to have to be the messenger on this one...)
Here's a mini testcase. I've also attached the resulting .s files for
-O2 and -O3.
Shudder. Andrew's speculation that s was not working because there were
no symbols is correct: stepping with s works until the call to getpid().
I haven't actually tried to figure out why gdb isn't doing it right in
that case, because there's something potentially even uglier going on
in the -O3 case.
This is something that the "management" of gdb and the "management" of
gcc are going to have to take on and resolve as either "no, you can't
sanely debug -O3" or "we need some help from the compiler to sort this
one out". (And if the latter, then the same help may be useful with the
-O2 case!) (I haven't seen this addressed, but I could easily have
missed it.)
Note that in the case of -O3, foo() and bar() are NEVER actually called
from main; getpid() is called directly instead. (Note also the
reordering of the functions.) (Seeing that this sort of optimization is
pretty compellingly needed for C++ code, "don't do that" seems an
unlikely outcome.)
Donn
P.S. This may explain some instances of "stack unwind missed a frame"
bugs.
/* bat.c: every call below is the last thing its function does. */
bar() {
    getpid();
}
foo() {
    bar();
}
main()
{
    foo();
}
------------------ -O2 -------------------
.file "bat.c"
.global __fltused
.text
.p2align 4,,15
.globl _bar
.def _bar; .scl 2; .type 32; .endef
_bar:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp _getpid
.p2align 4,,15
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp _bar
.def ___main; .scl 2; .type 32; .endef
.p2align 4,,15
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
pushl %eax
pushl %eax
xorl %eax, %eax
andl $-16, %esp
call __alloca
call ___main
call _foo
movl %ebp, %esp
popl %ebp
ret
.def _getpid; .scl 2; .type 32; .endef
------------------------ -O3 ---------------------------------
.file "bat.c"
.global __fltused
.def ___main; .scl 2; .type 32; .endef
.text
.p2align 4,,15
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
pushl %eax
pushl %eax
xorl %eax, %eax
andl $-16, %esp
call __alloca
call ___main
call _getpid <<< NO CALL TO foo()
movl %ebp, %esp
popl %ebp
ret
.p2align 4,,15
.globl _bar
.def _bar; .scl 2; .type 32; .endef
_bar:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp _getpid
.p2align 4,,15
.globl _foo
.def _foo; .scl 2; .type 32; .endef
_foo:
pushl %ebp
movl %esp, %ebp
popl %ebp
jmp _getpid <<< NOTE THAT foo() doesn't call bar() either!
.def _getpid; .scl 2; .type 32; .endef
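As a toy model of why a frame can go missing in the listings above, the sketch below tracks frames in a plain array. It is only an illustration, not gdb's or gcc's actual mechanism: the names toy_stack, toy_call, toy_tail_jump, toy_backtrace and demo_backtrace are hypothetical, and the model assumes a tail jump simply reuses the caller's slot because the caller has already popped its frame (the popl %ebp before each jmp).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAMES 8

/* Toy call stack: one entry per live frame, innermost last. */
struct toy_stack {
    const char *frames[MAX_FRAMES];
    int depth;
};

/* Ordinary call: the caller's frame stays live and a new one is pushed. */
static void toy_call(struct toy_stack *s, const char *callee)
{
    assert(s->depth < MAX_FRAMES);
    s->frames[s->depth++] = callee;
}

/* Tail jump: the caller has already torn down its frame, so the callee
   takes over the caller's slot instead of pushing a new one. */
static void toy_tail_jump(struct toy_stack *s, const char *callee)
{
    assert(s->depth > 0);
    s->frames[s->depth - 1] = callee;
}

/* Write the backtrace into buf, innermost frame first. */
static void toy_backtrace(const struct toy_stack *s, char *buf, size_t len)
{
    int i;

    buf[0] = '\0';
    for (i = s->depth - 1; i >= 0; i--) {
        strncat(buf, s->frames[i], len - strlen(buf) - 1);
        if (i > 0)
            strncat(buf, " <- ", len - strlen(buf) - 1);
    }
}

/* Build the main -> foo -> bar -> getpid chain of the test case.
   tail == 0: every step is a call (unoptimized shape).
   tail != 0: main calls foo, then foo and bar jump (the -O2 shape). */
static void demo_backtrace(int tail, char *buf, size_t len)
{
    struct toy_stack s = { { "main" }, 1 };

    toy_call(&s, "foo");
    if (tail) {
        toy_tail_jump(&s, "bar");
        toy_tail_jump(&s, "getpid");
    } else {
        toy_call(&s, "bar");
        toy_call(&s, "getpid");
    }
    toy_backtrace(&s, buf, len);
}
```

Calling demo_backtrace(1, buf, sizeof buf) leaves only getpid and main in the trace, which is the view a debugger that trusts the stack would have once execution reaches getpid() in the -O2 build.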
-----Original Message-----
From: Michael Snyder [mailto:msnyder@redhat.com]
Sent: Friday, November 08, 2002 11:43 AM
To: Donn Terry
Cc: gdb-patches@sources.redhat.com
Subject: Re: Single step vs. "tail recursion" optimization
Donn Terry wrote:
>
> While debugging gdb, I ran across a really nasty little issue: the gcc
> guys (for the "bleeding edge", at least) have generated an
> optimization such that if the last thing in function x is a function
> call to y, it will short-circuit the return from x, and set things up
> so it returns directly from y. (A special case of tail recursion
> optimizations.)
>
> If you try to n (or s) over that, the debugged program runs away
> because gdb doesn't know about that magic. The real example is
> regcache_raw_read, which ends in a memcpy. Instead of jsr-ing to the
> memcpy and then returning, it fiddles with the stack and jmps to
> memcpy. Is this a known issue, and is it being worked, or have I just
> run across something new to worry about?
>
> (This is on Interix (x86, obviously from the code below) with a gcc
> that's less than a week old. I have no idea how long it might
> actually have been this way. I doubt the problem is actually unique
> to the x86, as this is a very general optimization.)
>
> Donn
Tail-recursion isn't a new optimization, but I have only the vaguest
recollection of ever having run up against it before. Could be there's
a change in the way GCC is implementing it. Could be we never handled
it before.
This sounds like a good argument for parsing the epilogue... ;-(
Michael
>
> Here's the code:
>
> 0x466e37 <regcache_raw_read+151>: mov 0x1c(%eax),%ecx
> 0x466e3a <regcache_raw_read+154>: mov 0x18(%eax),%eax
> 0x466e3d <regcache_raw_read+157>: mov (%eax,%esi,4),%edx
> 0x466e40 <regcache_raw_read+160>: mov 0x4(%ebx),%eax
> 0x466e43 <regcache_raw_read+163>: add %eax,%edx
> 0x466e45 <regcache_raw_read+165>: mov (%ecx,%esi,4),%eax
> 0x466e48 <regcache_raw_read+168>: mov %eax,0x10(%ebp)
> 0x466e4b <regcache_raw_read+171>: mov %edx,0xc(%ebp)
> 0x466e4e <regcache_raw_read+174>: mov %edi,0x8(%ebp)
> 0x466e51 <regcache_raw_read+177>: lea 0xfffffff4(%ebp),%esp
> 0x466e54 <regcache_raw_read+180>: pop %ebx
> 0x466e55 <regcache_raw_read+181>: pop %esi
> 0x466e56 <regcache_raw_read+182>: pop %edi
> 0x466e57 <regcache_raw_read+183>: pop %ebp
> 0x466e58 <regcache_raw_read+184>: jmp 0x77d91e60 <memcpy>
> 0x466e5d <regcache_raw_read+189>: lea 0x0(%esi),%esi
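To make the "parsing the epilogue" idea concrete, here is a minimal sketch that recognizes the tail of the quoted regcache_raw_read code: a run of pop instructions followed by a jmp instead of a ret. It is not how gdb does or would do it. The function name epilogue_tail_jump is hypothetical, the byte values it checks are the standard IA-32 encodings (0x58 to 0x5f for pop %reg, 0xe9 for jmp rel32) rather than anything taken from the post, and a real implementation would also have to handle other epilogue shapes and check the target against the current function's bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: if the bytes at `code` (loaded at address `pc`)
   are zero or more `pop %reg` instructions followed by `jmp rel32`,
   store the jump target in *target and return 1. Otherwise return 0. */
static int epilogue_tail_jump(const uint8_t *code, size_t len,
                              uint32_t pc, uint32_t *target)
{
    size_t i = 0;
    uint32_t rel;

    assert(code != NULL && target != NULL);

    /* 0x58..0x5f encode `pop %eax` through `pop %edi`. */
    while (i < len && code[i] >= 0x58 && code[i] <= 0x5f)
        i++;

    /* 0xe9 is `jmp rel32`: one opcode byte plus a 4-byte
       little-endian displacement. */
    if (len - i < 5 || code[i] != 0xe9)
        return 0;

    rel = (uint32_t)code[i + 1]
        | (uint32_t)code[i + 2] << 8
        | (uint32_t)code[i + 3] << 16
        | (uint32_t)code[i + 4] << 24;

    /* The displacement is relative to the instruction after the jmp. */
    *target = pc + (uint32_t)i + 5 + rel;
    return 1;
}
```

Fed the nine bytes 5b 5e 5f 5d e9 03 b0 92 77 at pc 0x466e54 (a reconstruction of the four pops and the jmp in the disassembly above, with the displacement computed from the printed addresses), the helper reports the memcpy address 0x77d91e60 as the target. A stepping algorithm could treat a target outside the current function as "this jmp behaves like a call followed by a return".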