possible compiler optimization error

Mike Marchywka marchywka@hotmail.com
Thu Jun 28 19:24:00 GMT 2007



This doesn't have anything to do with Cygwin, but it can be an important
point. Some compilers or applications (Intel's, I think, IIRC) can figure
out which processor you have at run time and pick which code to run.
Obviously the exe gets larger, since it carries more than one version of
the code, but if you need speed it can be helpful. I've thrown in assembly
code that needs a certain FPU, and it's great as long as you have a
fallback or diagnostics and don't fail with an unexplained invalid
instruction or "core dump."
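
To make the run-time selection idea concrete, here is a minimal sketch,
not how Intel's compiler actually does it. It assumes a reasonably recent
GCC or Clang on x86, where the __builtin_cpu_init() and
__builtin_cpu_supports() builtins are available. The names sum_generic,
sum_sse2, pick_sum and sum_dispatch are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

typedef double (*sum_fn)(const double *, size_t);

/* Portable fallback: runs on any CPU. */
static double sum_generic(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <emmintrin.h>

/* SSE2 variant. The target attribute lets this one function use SSE2
   even when the rest of the file is built without -msse2. */
__attribute__((target("sse2")))
static double sum_sse2(const double *v, size_t n)
{
    __m128d acc = _mm_setzero_pd();
    size_t i = 0;
    for (; i + 2 <= n; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(v + i)); /* two doubles per step */
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double s = lanes[0] + lanes[1];
    for (; i < n; i++)                              /* leftover element */
        s += v[i];
    return s;
}
#endif

/* Pick an implementation from what the running CPU reports, and say so
   on stderr when the fallback is used instead of faulting later. */
static sum_fn pick_sum(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sum_sse2;
    fprintf(stderr, "note: no SSE2 on this CPU, using generic code\n");
#endif
    return sum_generic;
}

/* Public entry point: resolves the implementation on first use. */
static double sum_dispatch(const double *v, size_t n)
{
    static sum_fn impl;
    if (!impl)
        impl = pick_sum();
    return impl(v, n);
}
```

Callers only ever use sum_dispatch(), so the binary carries both code
paths but never executes an instruction the CPU lacks. Note that the two
paths add in a different order, so for general inputs the results can
differ in the last bits.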

Close only counts in horseshoes, hand grenades, and floating point.





>From: Brian Dessent <brian@dessent.net>
>Reply-To: cygwin@cygwin.com
>To: cygwin@cygwin.com
>Subject: Re: possible compiler optimization error
>Date: Thu, 28 Jun 2007 12:01:31 -0700
>
>"Frederich, Eric P21322" wrote:
>
> > I do realize that they may in fact differ way out there beyond 15
> > decimal places.
> > What I don't understand is how two numbers pass a ==, then fail a >=,
> > then pass a >= unless (after compiler optimizations) the second and
> > third comparisons are actually comparing copies of these numbers which
> > aren't "bit-exact" copies.
> > Is this what you're saying might be happening and what -ffloat-store is
> > supposed to resolve?
> > If so, that makes sense and I can accept that.
>
>I think Dave already explained it, but in case it's not clear: on the
>i387, all floating point math happens in 80-bit registers, even if the
>underlying values are actually 32-bit (float) or 64-bit (double)
>quantities.  This means there can be extra bits of precision in the
>register if the value has not been written to memory yet.  -ffloat-store
>is kind of a hacky workaround to this problem that tells the compiler to
>try harder to write values to memory and read them back in whenever
>possible.  It's not a guaranteed fix, and it carries a performance
>hit.
>
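The register-versus-memory effect described above can be probed with a
small test. This is only a sketch: same_after_store and
report_store_effect are made-up names, and whether the comparison fails
depends on the target, the optimization level and the flags in use, so no
particular output is claimed. FLT_EVAL_METHOD is the standard C99 macro
from <float.h>; 0 means float and double are evaluated in their own
precision, 2 means everything is evaluated as long double (the 387 case).

```c
#include <float.h>
#include <stdio.h>

/* Returns 1 if a quotient that was forced through a 64-bit memory slot
   compares equal to the same quotient computed again. On a 387 build
   the recomputed value may still carry 80-bit precision, so this can
   return 0 there. */
static int same_after_store(double a, double b)
{
    volatile double stored = a / b;  /* volatile forces the store and reload */
    return stored == a / b;
}

/* Print the evaluation method and the outcome for one sample division. */
static void report_store_effect(void)
{
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    printf("1.0/3.0 equals its stored copy: %s\n",
           same_after_store(1.0, 3.0) ? "yes" : "no");
}
```

Building the same file for the 387 and again with sse2 math enabled, then
calling report_store_effect() from main, is a quick way to see whether a
given build is exposed to the problem.
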
>The real problem is not in the compiler, it's the crappy design of the
>i387.  The best workaround is not to use the 387 unit at all if
>possible.  This is what -mfpmath=sse does, as the sse unit was designed
>much more sanely so that it doesn't have this excess precision problem.
>
>Note that sse only has support for 32-bit floating point types; you need
>sse2 for 64-bit double types.  And -march=i686 does not enable sse2,
>because not all i686 class machines have sse2.  So that is why I said
>"if you have a sse2 machine and set -march appropriately", meaning e.g.
>-march=pentium4 or -march=k8.  That is why using "-march=i686" or
>"-march=i686 -msse" both fail: neither implies sse2.
>
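One way to check what a given set of flags actually enabled is to ask the
preprocessor. The sketch below relies on __SSE2__ and __SSE2_MATH__,
which GCC predefines on x86 when sse2 is enabled and when it is used for
floating point math, respectively. The function names are made up for
illustration.

```c
#include <stdio.h>

/* Describe, from compile-time macros, which unit this build uses for
   double arithmetic. */
static const char *double_math_unit(void)
{
#if defined(__SSE2_MATH__)
    return "sse2";
#elif defined(__SSE2__)
    return "387 (sse2 is enabled, but -mfpmath=sse is not in effect)";
#elif defined(__i386__)
    return "387 (sse2 is not enabled by these flags)";
#else
    return "unknown (not an x86 build?)";
#endif
}

/* Print the result so different flag combinations can be compared. */
static void print_double_math_unit(void)
{
    printf("double math: %s\n", double_math_unit());
}
```

Compiled for 32-bit x86 with "-march=i686 -mfpmath=sse", this should
report the 387; with "-march=pentium4 -mfpmath=sse" it should report
sse2.
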
>Using "-march=i686 -msse2" doesn't make a lot of sense to me, because it
>generates code that will cause invalid instruction faults on i686
>machines without sse2 (e.g.  ppro, celeron, pentium3, k7/athlon.)  By
>giving -msse2 you're already limiting the architecture to pentium4/k8
>anyway, so you might as well just use the correct -march.
>
>This is all thankfully moot on x86_64, because sse2 is always available
>there and is the default for float and double math, leaving the 387
>essentially unused (apart from long double).
>
>Brian
>
>--
>Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>Problem reports:       http://cygwin.com/problems.html
>Documentation:         http://cygwin.com/docs.html
>FAQ:                   http://cygwin.com/faq/
>





