This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] [BZ15384] Enhance finite and isfinite.


On Sun, 21 Apr 2013, Ondřej Bílka wrote:

On Sun, Apr 21, 2013 at 03:35:19PM +0200, Marc Glisse wrote:
On Sun, 21 Apr 2013, Ondřej Bílka wrote:
However, on x64 even gcc without optimizations expands finite to an inline
version, which is slower than my version (see benchmark).

This seems to depend on the CPU. Here:
model name	: Intel(R) Core(TM)2 Duo CPU     T9600  @ 2.80GHz

I cannot reproduce this
on Intel(R) Core(TM)2 Quad  CPU   Q9300  @ 2.50GHz
and  Intel(R) Core(TM)2 Duo CPU     E7200  @ 2.53GHz
Could you try running the new version again?

However, my results on an AMD Phenom(tm) II X6 1090T Processor are below.

I fixed a few mistakes in the benchmark; it should now be the correct version.
One problem is that we are affected by gcc bugs, particularly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349

Funny, when I run your example from comment #2 in that PR, -march=native helps. On the other hand, -march=native hurts in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57024#c1

Another explanation may be a bug in gcc that aligns loops only to 8
bytes. One implementation can get faster just because it is 16-byte
aligned, so I changed that in the assembly.

That sounds like a good reason.

current

real	0m0.816s
user	0m0.813s
sys	0m0.000s

new # from mail


real	0m0.738s
user	0m0.737s
sys	0m0.000s

opt # with fixed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349


real	0m0.703s
user	0m0.700s
sys	0m0.000s

nonzero #from PR


real	0m0.826s
user	0m0.820s
sys	0m0.003s

Different order here:
.81
.72
.76
.64

--
Marc Glisse

