This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v2.0] Use saturated arithmetic for overflow detection.


On Fri, Nov 01, 2013 at 10:19:25AM -0700, Paul Eggert wrote:
> Thanks for looking into this.
> 
> I agree with earlier comments that we care about overall
> performance not just the individual ops, that optimizing
> for code size is probably best unless we get significant numbers
> suggesting otherwise, and that it may be time to ask the GCC folks
> for help with fast saturated arithmetic ops.  Some other suggestions:
> 
I already asked, but adding them will take time, and the benefits of a
builtin are small. When

> Stick with inline functions not macros, and use lower-case names since they're
> functions.
>
 
> If you like tuning this stuff you might want to look at
> <http://locklessinc.com/articles/sat_arithmetic/>, which
> shows how to do saturated arithmetic without jumps, both portably
> and on x86-64; I don't know whether this will save code space, though.

On Sandy Bridge my implementation runs in:

real	0m0.548s
user	0m0.548s
sys	0m0.000s

When I replace the multiplication with the one from the article, it is slower:

real	0m0.599s
user	0m0.599s
sys	0m0.000s

I got a similar slowdown on Core 2, Nehalem and fx10 machines.
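
To make the comparison above concrete, here is a minimal sketch of the two
styles of saturated multiplication. It is not the code from the patch or a
verbatim copy of the article: the function names are hypothetical, and it
assumes 32-bit operands so that the overflow check can be written portably
with a 64-bit product.

```c
#include <stdint.h>

/* Branching variant (hypothetical name): overflow is handled by a jump
   that is rarely taken, so the branch predictor almost always gets it
   right.  */
static inline uint32_t
sat_mul_branch (uint32_t x, uint32_t y)
{
  uint64_t wide = (uint64_t) x * y;
  if (wide > UINT32_MAX)
    return UINT32_MAX;
  return (uint32_t) wide;
}

/* Branch-free variant (hypothetical name), in the spirit of the article:
   build an all-ones mask when the high half of the product is nonzero and
   OR it into the low half.  */
static inline uint32_t
sat_mul_branchless (uint32_t x, uint32_t y)
{
  uint64_t wide = (uint64_t) x * y;
  uint32_t hi = (uint32_t) (wide >> 32);
  uint32_t lo = (uint32_t) wide;
  return lo | -(uint32_t) (hi != 0);
}
```

Both functions return the same values; the difference is that the second
always pays for the mask computation, while the first pays only for a
predictable jump in the common case.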

The article is an example of misapplying the rule of eliminating branches
whenever possible. That rule holds only when the branch is mispredicted at
least 5% of the time. See http://yarchive.net/comp/linux/cmov.html
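
As an illustration of why the branch is cheap here, consider a saturated
addition where overflow is the exceptional case. The sketch below is only an
assumption about how one might write it (the name sat_add_expect is
hypothetical); it uses the GCC/Clang builtin __builtin_expect to tell the
compiler that the overflow path is unlikely.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturated unsigned addition with a branch marked as unlikely.
   Unsigned wraparound is well defined in C, so sum < x detects overflow.  */
static inline size_t
sat_add_expect (size_t x, size_t y)
{
  size_t sum = x + y;
  if (__builtin_expect (sum < x, 0))
    return SIZE_MAX;
  return sum;
}
```

When sizes almost never overflow, this branch is almost never mispredicted,
which is the situation where replacing it with branch-free code tends not to
pay off.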

As far as code size is concerned, my assembly has 8 extra bytes (jump 2, xor 3, neg 3).
If I use the sbb trick from the article, I can decrease that to 5.

The article's code takes 6 bytes per operation (sbb 3, or 3).

A comparison takes 10 bytes (mov constant,reg 7, cmp 3).

