This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
GProf's sampling inaccuracies
- From: Angus <angus at uducat dot com>
- To: binutils at sourceware dot org
- Date: Tue, 19 Jun 2007 11:14:46 -0400
- Subject: GProf's sampling inaccuracies
I'm profiling my C++ program and some of the results don't make sense to me:
certain functions appear to take far more time than they should. The best
example is a method that does only one thing, an integer compare. Using
different data, I can get the
flat profile to consistently report an average of about 275 self nanoseconds
per call. All that time just to compare two integers? Here's what it looks
like:
bool ProductFields::CompareReferences(const ID &id,
                                      const ProductField &field) const {
    return field.id == id;
}
That's it. ID is typedef'd to an unsigned int. ProductField is a class and
ProductField::id is also an ID.
This method is virtual, overriding a pure virtual in the base class, and is
called through the base. I can understand a minor performance hit from virtual
dispatch, although I'd expect a sample taken while resolving the virtual call
to be attributed to the caller. Either way, I wouldn't think it enough to
create a 275 ns burden! I even tried applying the GNU attributes const (which
probably does nothing, since it is never called from a loop) and nothrow. (I
didn't bother with fastcall, since I'm running on an AMD64.)
I've noticed some of the STL methods are pretty slow, too. The
__normal_iterator constructor and vector<>::size() keep coming up, with a more
modest 25-100 ns per call. I've looked at the code for vector<>::size(), which
appears to be just the subtraction of two iterators. That doesn't look like it
should be taking that long either.
Could the problem be my understanding of the statistical inaccuracy of GProf's
sampling? From what I read on
http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html_chapter/gprof_6.html#SEC19:
n = R / s
and
E = sqrt(n) * s
where
n: number of samples taken
R: reported self seconds
s: sampling period
E: the amount of the error
so if
A = R (+/-) E
where
A: actual number of self seconds
(+/-): plus or minus
then substituting n = R/s into E gives E = sqrt(R/s) * s = sqrt(R*s), so:
A = R (+/-) sqrt(R*s)
So unless my nondeterministic program has the very bad fortune of running such
that CompareReferences() keeps getting called right when a sample is about to
be taken, the error should not be that severe when the self seconds are
measured in seconds and the sampling period is 0.01 seconds.