This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.


[Bug math/602] powerpc rint() function is buggy in the rounding toward -inf and +inf modes


------- Additional Comments From sjmunroe at us dot ibm dot com  2004-12-22 21:27 -------
Ok, I found the problem in rint and floor and realized that it would be
simpler/faster to force the sign bit to the correct value (using the fabs/fnabs
instructions) than to compare/branch for the x == 0.0 case. The fabs/fnabs has a
latency of 6 cycles, while the fcmpu/bc has a latency of 8+2.
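
As a rough illustration of the sign-forcing idea, here is a minimal C sketch.
It is not the glibc assembly: the name sketch_rint, the use of volatile to
keep the compiler from folding the arithmetic, and the 2^52 add/subtract
trick written out in C are all assumptions made for this example. The point
is that the result's sign is forced from the input's sign with fabs or
-fabs (the C analogue of fabs/fnabs) instead of testing for x == 0.0.

```c
#include <math.h>

/* Hypothetical sketch, not the glibc implementation.
   Rounds x to an integer in the current rounding mode. */
static double sketch_rint(double x)
{
    const double two52 = 4503599627370496.0;   /* 2^52 */
    volatile double r;                          /* volatile: keep each step */

    if (!(fabs(x) < two52))                     /* NaN, inf, or already integral */
        return x;

    if (signbit(x)) {
        r = x - two52;                          /* rounds in the current mode */
        r = r + two52;
        return -fabs(r);                        /* force sign negative (fnabs) */
    }
    r = x + two52;
    r = r - two52;
    return fabs(r);                             /* force sign positive (fabs) */
}
```

Without the final fabs/-fabs step, an exact zero produced by the second
add/subtract can carry the wrong sign in the directed rounding modes, which
is the kind of case a compare/branch on x == 0.0 would otherwise guard.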

This simplifies the code and allows the elimination of the -0.0 constant used
in some of the rounding functions. Finally, I noticed that some of the float
(i.e. roundf, rintf, ...) functions were still using double constant values, so
I decided to clean up the whole set of rounding functions (ceil, floor, rint,
round, trunc) for float and double.
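
To show what a float constant looks like in this context, here is a minimal
C sketch of a single-precision variant. Again this is an assumption-laden
illustration, not the glibc code: sketch_rintf is a hypothetical name, and
the constant 2^23 is simply the float counterpart of the 2^52 value that
the double version of this technique would use.

```c
#include <math.h>

/* Hypothetical sketch, not the glibc implementation.
   Single-precision version that stays in float throughout. */
static float sketch_rintf(float x)
{
    const float two23 = 8388608.0f;   /* 2^23, a float constant */
    volatile float r;                 /* volatile: round each step to float */

    if (!(fabsf(x) < two23))          /* NaN, inf, or already integral */
        return x;

    if (signbit(x)) {
        r = x - two23;
        r = r + two23;
        return -fabsf(r);             /* force sign negative */
    }
    r = x + two23;
    r = r - two23;
    return fabsf(r);                  /* force sign positive */
}
```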

This gives a nice performance boost for POWER4, 970 (AKA G5) and POWER5
(10-135%). But I noticed anomalies in the round and trunc results. These depend
on the final alignment of the code, which in some cases changes the dispatch
groupings, resulting in branch mispredictions. So for PPC64 I replaced the ENTER
macro with EALIGN (?, 4, 0), which forces quadword alignment. This improved the
results. I did not do this for powerpc32 because I don't have access to
powerpc32 machines to test on.
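
For readers working in C rather than assembly, a loosely comparable effect
can be sketched with the GCC/Clang aligned function attribute. This is only
an analogy to the EALIGN change and an assumption of this example: the
names QUADWORD_ALIGN, sketch_trunc and is_quadword_aligned are hypothetical,
and the attribute is a compiler extension, not part of ISO C.

```c
#include <math.h>
#include <stdint.h>

/* GCC/Clang extension: request 16-byte (quadword) alignment of the
   function's entry point. */
#define QUADWORD_ALIGN __attribute__((aligned(16)))

/* Hypothetical sketch of trunc, placed on a quadword boundary. */
QUADWORD_ALIGN double sketch_trunc(double x)
{
    const double two52 = 4503599627370496.0;   /* 2^52 */
    double r;

    if (!(fabs(x) < two52))                     /* NaN, inf, or already integral */
        return x;

    r = (double)(long long)x;                   /* conversion truncates toward zero */
    return signbit(x) ? -fabs(r) : fabs(r);     /* keep the sign of x, even for zero */
}

/* Returns nonzero if the given function address is 16-byte aligned. */
static int is_quadword_aligned(double (*fn)(double))
{
    return ((uintptr_t)fn % 16) == 0;
}
```

Whether such alignment helps is machine dependent; the measurements above
apply to the PPC64 assembly versions, not to this sketch.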

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


http://sources.redhat.com/bugzilla/show_bug.cgi?id=602

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

