This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



GProf's sampling inaccuracies


I'm profiling my C++ program and seeing results that don't make a lot of 
sense to me. Some functions appear to take much more time than they should. 
The best example is this one, which does only one thing: an integer compare. 
Across different data sets, the flat profile consistently reports an average 
of about 275 self nanoseconds per call. All that time just to compare two 
integers? Here's what it looks like:

bool ProductFields::CompareReferences(const ID &id,
                                      const ProductField &field) const {
	return field.id == id;
}

That's it. ID is typedef'd to an unsigned int. ProductField is a class and 
ProductField::id is also an ID.
	This method is a virtual that overrides a pure virtual, and it is called 
from the parent class. I can understand a minor performance hit from its 
being virtual, although I'd expect a sample taken while the virtual dispatch 
is being resolved to be charged to the caller. Either way, I wouldn't think 
it enough to create a 275ns burden! I even tried applying the GNU attributes 
const (which probably does nothing, since it is never called from a loop) and 
nothrow. (I didn't bother with fastcall, since I'm running on an AMD64.)
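	For reference, the arrangement described above can be sketched like this. 
The base class name and its member layout are my guesses, since I didn't post 
the full declarations; only CompareReferences, ID, and ProductField::id are 
real:

```cpp
#include <cassert>

typedef unsigned int ID;

struct ProductField {
    ID id;
};

// Hypothetical parent class: declares the pure virtual that the parent's
// own code invokes through the vtable.
class ProductFieldsBase {
public:
    virtual ~ProductFieldsBase() {}
    virtual bool CompareReferences(const ID &id,
                                   const ProductField &field) const = 0;
};

// The derived class supplies the one-line integer compare from above.
class ProductFields : public ProductFieldsBase {
public:
    bool CompareReferences(const ID &id,
                           const ProductField &field) const {
        return field.id == id;
    }
};
```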
	I've noticed some of the STL methods are pretty slow, too. The 
__normal_iterator constructor and vector<>::size() keep coming up, with a 
more modest 25ns-100ns per call. I tracked down the code for 
vector<>::size(), which appears to be the subtraction of two iterators. That 
doesn't look like it should take that long either.
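	As far as I can tell, that subtraction boils down to something like the 
sketch below (my rough equivalent, not the actual libstdc++ source):

```cpp
#include <cstddef>
#include <vector>

// Roughly what vector<>::size() reduces to: the distance between the
// begin and end iterators, which for vector are thin pointer wrappers.
// With optimization this inlines to a subtract and a shift, so seeing it
// as a distinct entry in a profile usually suggests the build wasn't
// optimized, or that inlining was otherwise suppressed.
template <typename T>
std::size_t sketch_size(const std::vector<T> &v) {
    return static_cast<std::size_t>(v.end() - v.begin());
}
```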

	Could the problem be my understanding of the statistical inaccuracies in 
GProf? From what I read at 
http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html_chapter/gprof_6.html#SEC19:

n = R / s
and
E = sqrt(n) * s
where
	n: number of samples taken
	R: reported self seconds
	s: sampling period
	E: the amount of the error
so if
A = R (+/-) E
where
	A: actual number of self seconds
	(+/-): plus or minus
then a quick bit of algebra (E = sqrt(R/s) * s = sqrt(R*s)) yields:
A = R (+/-) sqrt(R * s)

So unless my nondeterministic program has the very bad fortune of running such 
that CompareReferences() keeps getting called right when a sample is about to 
be taken, the error should not be that severe when the self seconds are 
measured in whole seconds and the sampling period is 0.01 seconds.
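Plugging some example numbers into that formula shows the magnitude I mean 
(the R values below are hypothetical, not taken from my profile):

```cpp
#include <cmath>

// Expected sampling error per the gprof manual's formulas:
//   n = R / s   (samples that landed in the function)
//   E = sqrt(n) * s = sqrt(R * s)
double gprof_error(double reported_self_seconds, double sampling_period) {
    return std::sqrt(reported_self_seconds * sampling_period);
}

// With s = 0.01 s and R = 1 s, E = sqrt(0.01) = 0.1 s, a 10% relative
// error; at R = 10 s it is sqrt(0.1), roughly 0.32 s, or about 3%.
```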

