This is the mail archive of the cygwin mailing list for the Cygwin project.


RE: possible compiler optimization error

> From: On Behalf Of Brian Dessent
> Sent: Thursday, June 28, 2007 3:02 PM
> To:
> Subject: Re: possible compiler optimization error
> I think Dave already explained it but in case it's not clear, on the
> i387, all floating point math happens in 80 bit registers, even if the
> underlying values are actually 32 bit (float) or 64 bit (double)
> quantities.  This means there can be extra bits of precision in the
> register if the value has not been written to memory yet.
> -ffloat-store is kind of a hacky workaround to this problem that tells
> the compiler to try harder to write values to memory and read them
> back in whenever possible.  It's not a guaranteed fix, and it has a
> performance hit.
>
> The real problem is not in the compiler, it's the crappy design of the
> i387.  The best workaround is not to use the 387 unit at all if
> possible.  This is what -mfpmath=sse does, as the sse unit was designed
> much more sanely so that it doesn't have this excess precision problem.
>
> Note that sse only has support for 32 bit floating point types; you
> need sse2 for 64 bit double types.  And -march=i686 does not enable
> sse2 because not all i686 class machines have sse2.  So that is why I
> said "if you have a sse2 machine and set -march appropriately", meaning
> e.g. -march=pentium4 or -march=k8.  That is why using "-march=i686" or
> "-march=i686 -msse" both fail, because neither implies sse2.
>
> Using "-march=i686 -msse2" doesn't make a lot of sense to me, because
> it generates code that will cause invalid instruction faults on i686
> machines without sse2 (e.g. ppro, celeron, pentium3, k7/athlon.)  By
> giving -msse2 you're already limiting the architecture to pentium4/k8
> anyway, so you might as well just use the correct -march.
>
> This is all thankfully moot on x86_64, because there the 387 is
> obsoleted and essentially disabled entirely.
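
For readers who want to see the effect described above, here is a
minimal C sketch.  It is an illustration under assumptions, not code
from this thread: the function names are made up for the example, and
whether the comparison ever fails depends on the compiler, the
optimization level and the target, so no particular result is promised.
FLT_EVAL_METHOD comes from the C99 <float.h>; a value of 2 means
intermediates are kept in long double precision (typical for 387 code),
and 0 means each operation is evaluated in its own type (typical for
SSE2 code).

```c
#include <assert.h>
#include <float.h>

/* Reports how the compiler evaluates floating point intermediates:
 * 0 = in the operand type, 2 = in long double (typical for 387 code). */
int eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Forces the quotient to be rounded to a 64-bit double by storing it
 * through a volatile object before returning it. */
double stored_quotient(double a, double b)
{
    volatile double q = a / b;
    return q;
}

/* Compares a freshly computed quotient with a copy that was stored to
 * memory.  With excess precision the fresh value may still carry extra
 * bits, so this can return 0 even though both sides are "a / b". */
int quotient_matches_stored(double a, double b)
{
    volatile double stored = a / b;
    return (a / b) == stored;
}

/* Sanity check: two stored results of the same division must agree
 * (valid for inputs whose quotient is not NaN). */
void check_stored_results_agree(double a, double b)
{
    assert(stored_quotient(a, b) == stored_quotient(a, b));
}
```

Built for a 32-bit x86 target with 387 math, quotient_matches_stored(1.0,
3.0) may return 0; built with -march=pentium4 -mfpmath=sse, or on
x86_64, it should return 1.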

This is all very good information.  Thank you all very much.
I was just reading
linked to by another posting on here.
Much like you say that -ffloat-store is a hacky workaround, on that bug
report it is said that -ffloat-store "may trigger instead of suppressing
the bug".

My using -march=i686 was because I couldn't find a list of all accepted
values in the man page for gcc.  After some googling I found that I can
use -march=pentium-m for my Dell D600 Laptop.  I am now happy to report
that setting -march=pentium-m -O2 works fine.  I am glad to hear that
using SSE2 correctly solves the problem without having to use
-ffloat-store and take a possible performance hit.
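
One way to check which floating point unit a given set of flags really
selects is to look at the compiler's predefined macros.  The sketch
below assumes a GCC that predefines __SSE2__, __SSE_MATH__ and
__SSE2_MATH__ on x86 targets; compilers that do not define them will
simply be reported as using the 387.  The enum and function names are
hypothetical, chosen for this example only.

```c
#include <assert.h>

/* Hypothetical labels for the unit the compiler uses for FP math. */
enum fp_unit { FP_UNIT_387, FP_UNIT_SSE, FP_UNIT_SSE2, FP_UNIT_OTHER };

/* Reports the FP math unit selected at compile time, based on the
 * macros GCC predefines for -mfpmath=sse with -msse / -msse2. */
enum fp_unit fp_math_unit(void)
{
#if defined(__SSE2_MATH__)
    return FP_UNIT_SSE2;   /* float and double both use SSE registers */
#elif defined(__SSE_MATH__)
    return FP_UNIT_SSE;    /* only float uses SSE; double uses the 387 */
#elif defined(__i386__) || defined(__x86_64__)
    return FP_UNIT_387;    /* x86 target without SSE math selected */
#else
    return FP_UNIT_OTHER;  /* not an x86 target */
#endif
}

/* Returns 1 if the SSE2 instruction set is enabled for this build
 * (for example via -msse2 or an -march value that implies it). */
int sse2_enabled(void)
{
#if defined(__SSE2__)
    return 1;
#else
    return 0;
#endif
}

/* Sanity check: SSE2 math can only be selected when SSE2 is enabled. */
void check_fp_config(void)
{
    assert(fp_math_unit() != FP_UNIT_SSE2 || sse2_enabled());
}
```

Note that enabling the SSE2 instruction set and using it for floating
point math are separate settings in this sketch: sse2_enabled() follows
-march/-msse2, while fp_math_unit() only reports FP_UNIT_SSE2 when SSE
math is actually in effect.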

I should also mention that the Solaris machine I was using is a SPARC
and the Linux machine I was using is an Opteron.
It would be interesting to load Solaris x86 or Linux on the same Windows
laptop just to prove that it is the hardware.
