This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: GProf's sampling inaccuracies
On Saturday 23 June 2007 15:18, Segher Boessenkool wrote:
> > For some strange reason the optimized version (that is, built from the
> > "optimized" profile in KDevelop) didn't give me this trouble. However, the
> > execution time for CompareReferences() went up 70%. God help me if that's
> > more accurate!
>
> No idea. Could be KDevelop's fault, could be anything
> else.

Then let's look at it from an assembly point of view. I ran:

    gcc -O2 -DNDEBUG -c -Wa,-a,-ad productfields.cpp

which is a good approximation of the compile call (with the assembly-listing
options added) made in the build profile that compiles the module without
profiling but links with profiling. This is the assembly output for
CompareReferences():

     134 0010 8B06          movl    (%rsi), %eax
     135 0012 394218        cmpl    %eax, 24(%rdx)
     136 0015 0F94C0        sete    %al
     137 0018 0FB6C0        movzbl  %al, %eax
     138 001b C3            ret

This is on an AMD64, remember. KDevelop can't have anything to do with it,
since I didn't run the program in that environment but from a shell. When I
run the program with profiling enabled in the build, I get 5743514 calls to
CompareReferences() in the flat profile. When I run it with profiling enabled
only at link time, the self seconds are 1.85. That comes to an average of
322ns/call. How do we explain that? Do we conclude that those 5743514
executions of movl and 5743514 executions of cmpl were all cache misses?
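
For anyone who wants to check the arithmetic, the 322ns figure follows from the
two numbers quoted above. A minimal sketch in Python; the 0.01 s sample period
is an assumption (it is what gprof commonly reports in the flat profile header,
but it is not stated in this message), and the function names are made up for
illustration:

```python
# Figures quoted in the message above.
CALLS = 5743514        # calls to CompareReferences() in the flat profile
SELF_SECONDS = 1.85    # self seconds reported for the same function

# ASSUMPTION: one profiling sample per 0.01 s. This value is not given in
# the message; check the "Each sample counts as ..." line of your own
# flat profile before relying on it.
SAMPLE_PERIOD_S = 0.01


def ns_per_call(seconds, n_calls):
    """Average self time per call, in nanoseconds."""
    return seconds / n_calls * 1e9


def sample_count(seconds, period_s):
    """Number of profiling samples that add up to the given self time."""
    return round(seconds / period_s)


if __name__ == "__main__":
    print(f"{ns_per_call(SELF_SECONDS, CALLS):.0f} ns/call")   # 322 ns/call
    print(f"{sample_count(SELF_SECONDS, SAMPLE_PERIOD_S)} samples")
```

Under that assumed period, the 1.85 self seconds would correspond to a couple
of hundred samples spread over 5.7 million calls, so the per-call average is a
statistical estimate rather than a measurement of any single call.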