This is the mail archive of the cygwin mailing list for the Cygwin project.


RE: possible compiler optimization error

On 28 June 2007 19:28, Frederich, Eric P21322 wrote:

>> From: On Behalf Of Brian Dessent
>> Sent: Thursday, June 28, 2007 1:53 PM
>> To:
>> Subject: Re: possible compiler optimization error
> Thanks for looking at it.  I am in unfamiliar water here.
>> Try with -ffloat-store.  Or if you have a sse2 capable
>> machine, set the
>> appropriate -march= and use -mfpmath=sse.  Both of these attempt to
>> bypass problems caused by the excess precision of 80 bit double on
>> i387.  If they fix the problem, it's a bug in your code, not
>> anything to do with the compiler.
> -mfpmath=sse didn't work but -msse did.  Here are some new findings...
> -ffloat-store     -O2  passes
> -march=i686       -O2  fails
> -march=i686 -sse  -O2  fails
> -march=i686 -sse2 -O2  passes

  This does look an awful lot like now.

> What I don't understand is how two numbers pass a ==, then fail a >=,
> then pass a >= unless (after compiler optimizations) the second and
> third comparisons are actually comparing copies of these numbers which
> aren't "bit-exact" copies.
> Is this what you're saying might be happening and what -ffloat-store is
> supposed to resolve?

  Yep, that's exactly what's happening, and the reason why is that when these
values are live in FPU registers, they have 80-bit precision, which is more
than they're supposed to and has knock-on effects.  In other parts of the
code, where the compiler doesn't have enough FPU regs and has had to spill
some of them to stack variables, they get cut back to the regular 53-bit
precision, which makes the maths functions work properly again.  Whether the
compiler keeps a variable's value in a register, or decides it's short on
registers and spills the value to memory, varies greatly with the
optimisation options.
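
  To see the effect in isolation, here is a minimal sketch in C.  It is not
taken from your program: it uses long double to stand in for a value that is
live in an FPU register and a volatile double to stand in for the same value
after a spill to memory, and both function names are made up for illustration.
On targets where long double is no wider than double, the two always agree.

```c
#include <float.h>

/* Returns 1 if long double carries more mantissa bits than double on this
   target (true for the 80-bit x87 format, which has a 64-bit mantissa). */
int has_wider_long_double(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}

/* Returns 1 if a value held at extended ("register") precision still
   compares equal to itself after being rounded to a double ("spilled"). */
int survives_spill(long double in_register)
{
    /* volatile forces a real store, so the value is rounded to double */
    volatile double spilled = (double)in_register;
    return (long double)spilled == in_register;
}
```

  A value such as 0.5 is exact in both formats, so it survives the round trip.
A value such as 1.0L / 3.0L has no finite binary expansion, so where long
double is wider the spilled copy is no longer a bit-exact copy, which is the
kind of mismatch that lets one comparison pass and the next one fail.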

>> If you want a definitive answer then you need to provide a standalone
>> testcase that compiles.  Sample code taken out of context
>> that can't be compiled is significantly less useful.
> I really want to but it is a huge program and I am afraid that if I
> create a chopped down example I can't guarantee that the same
> optimizations will happen.

  You can't, but it's often worth trying.

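  Another thing you could try doing is adding the following code (taken from
comment #60 to PR323) to your program, and calling it as the first thing your
program does:

typedef unsigned short fpu_control_t;  /* assumed: 16-bit x87 control word */
#define _FPU_SETCW(cw) __asm__ ("fldcw %0" : : "m" (*&cw))

void set_math_double_precision() {
  fpu_control_t fpu_control = 0x027f ;  /* 53-bit precision, exceptions masked */
  _FPU_SETCW(fpu_control);
}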

  This will set your FPU so it uses the same (53-bit) mantissa size in
registers as it does in memory, thereby avoiding the discrepancies.
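
  If you want to check what a given control word value means, the precision
setting lives in bits 8 and 9 of the x87 control word.  The following is only
a sketch for decoding that field; the function names are invented for this
example and it does not touch the FPU itself.

```c
/* Extract the precision-control field (bits 8-9) of an x87 control word. */
unsigned precision_control(unsigned short cw)
{
    return (cw >> 8) & 0x3u;
}

/* Map the precision-control field to the mantissa size used in registers.
   Returns -1 for the reserved encoding (binary 01). */
int mantissa_bits(unsigned short cw)
{
    switch (precision_control(cw)) {
    case 0:  return 24;   /* single precision */
    case 2:  return 53;   /* double precision */
    case 3:  return 64;   /* extended precision */
    default: return -1;   /* reserved */
    }
}
```

  With this, mantissa_bits(0x027f) gives 53, whereas 0x037f (the value the
FPU has after an FINIT) gives 64, which is the extended precision that causes
the discrepancies.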

Can't think of a witty .sigline today....
