This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: gcc -ffast-math defect with tan(x)


Eric Backus wrote:

> One experiment that I did, which confused me more than anything else, is 
> replace the calls to tan() with calls to log() (and change all the 0.0 values 
> to something OK for log() like 1.0).  The generated assembly code appears to 
> be identical except that _f_tan is replaced by _f_log, but the program works 
> correctly.  That would mean that the generated assembly code is correct, and 
> the defect is in _f_tan?

  Something I don't understand is going on in the x87 fpu.  Your STC again:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double d1 = 0.0;
    double d2 = 0.0;
    d1 = tan(d1);
    d2 = tan(d2);
    (void) printf("d1 = %lg, expecting 0 (or -0)\n", d1);
    (void) printf("d2 = %lg, expecting 0 (or -0)\n", d2);
    return 0;
}

  In this code, _f_tan is called twice, with a value of zero each time.  But
it behaves differently the second time.  The _f_tan code in assembly looks like:

> (gdb) disass 0x610ea500
> Dump of assembler code for function _f_tan:
> 0x610ea500 <_f_tan+0>:  push   %ebp
> 0x610ea501 <_f_tan+1>:  mov    %esp,%ebp
> 0x610ea503 <_f_tan+3>:  fldl   0x8(%ebp)
> 0x610ea506 <_f_tan+6>:  fptan
> 0x610ea508 <_f_tan+8>:  fincstp
> 0x610ea50a <_f_tan+10>: leave
> 0x610ea50b <_f_tan+11>: ret
> End of assembler dump.
> (gdb)

  Here's the first run through:

> Breakpoint 2, 0x004011e8 in _f_tan ()
> _f_tan () at /gnu/winsup/src/newlib/libm/machine/i386/f_tan.S:28
> 28		pushl ebp
> Current language:  auto; currently asm
> 29		movl esp,ebp
> 30		fldl 8(ebp)

This is the fp state just before executing the fldl above:

>   R7: Empty   0x00000000000000000000
>   R6: Empty   0x00016106e7800022c548
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
> =>R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff0000                                            
>                        TOP: 0
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffffff
> Instruction Pointer: 0x00:0x00000000
> Operand Pointer:     0xffff0000:0x00000000
> Opcode:              0x0000
> 31		fptan

This is the fp state just before executing the fptan.  Zero has been loaded
into r7 which is the current top-of-stack a.k.a. st(0):

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Empty   0x00016106e7800022c548
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff7fff
> Instruction Pointer: 0x1b:0x0040a8d2
> Operand Pointer:     0xffff0023:0x0175ffa0
> Opcode:              0xdf7d
> 32		fincstp

  This is the fp state immediately after the fptan and before the fincstp.  It
has replaced st(0) (r7) with zero (= tan 0.0) and pushed a constant +1 into r6,
which is now top-of-stack (as is the documented behaviour of fptan):

>   R7: Zero    0x00000000000000000000 +0                         
> =>R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3000                                            
>                        TOP: 6
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x1b:0x00a50c88
> Operand Pointer:     0xffff0023:0x03f6f5c8
> Opcode:              0xddd8
> 34		leave
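
  For anyone reading these dumps, the Tag Word is what records which registers
the FPU considers in use: two bits per physical register, R7 in the top two
bits and R0 in the bottom two (00 = Valid, 01 = Zero, 10 = Special, 11 = Empty).
Only the low 16 bits of the value gdb prints are meaningful.  A minimal sketch
of the decoding in C follows; decode_tag and the enum names are made up for
illustration and are not part of gdb or newlib.

```c
#include <stdint.h>

/* x87 tag encodings, two bits per physical register. */
enum x87_tag { TAG_VALID = 0, TAG_ZERO = 1, TAG_SPECIAL = 2, TAG_EMPTY = 3 };

/* Hypothetical helper: return the tag of physical register Rn (n = 0..7)
   from the low 16 bits of the tag word shown by gdb. */
static enum x87_tag decode_tag(uint16_t tag_word, int n)
{
    return (enum x87_tag)((tag_word >> (2 * n)) & 0x3);
}
```

  Applied to the 0x4fff above, this gives Zero for R7, Valid for R6 and Empty
for R5 down to R0, which matches the register listing.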

  FP stack pointer has been incremented and we return the result in ST(0):

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x00:0x00000000
> Operand Pointer:     0xffff0000:0x00000000
> Opcode:              0x0000
> Undefined command: "".  Try "help".

  So far so good.  Then it comes to the second execution:


> Breakpoint 2, 0x004011e8 in _f_tan ()
> _f_tan () at /gnu/winsup/src/newlib/libm/machine/i386/f_tan.S:28
> 28		pushl ebp
> 29		movl esp,ebp
> 30		fldl 8(ebp)

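  State before the fldl, at the same point as before.  Note that this time r6
is still tagged Valid, holding the +1 left over from the first call: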

>   R7: Empty   0x00000000000000000000
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
> =>R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff0000                                            
>                        TOP: 0
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffcfff
> Instruction Pointer: 0x1b:0x00483622
> Operand Pointer:     0xffff0023:0x0012e7d8
> Opcode:              0xdd1c
> 31		fptan

  State after fldl, before fptan: r7 correctly loaded with zero, as before.

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x1b:0x00a50c88
> Operand Pointer:     0xffff0023:0x03f6f5c8
> Opcode:              0xddd8
> 32		fincstp

  WTF?  The fptan has returned two QNaNs for no apparent reason?

>   R7: Special 0xffffc000000000000000 Real Indefinite (QNaN)
> =>R6: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3241   IE                       SF      C1      
>                        TOP: 6
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffafff
> Instruction Pointer: 0x1b:0x007c74c7
> Operand Pointer:     0xffff0023:0x03e6fd5c
> Opcode:              0xddd8
> 34		leave
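
  The Status Word in that dump can be picked apart by hand: IE is bit 0, SF
(stack fault) is bit 6, C1 is bit 9 and TOP is bits 11-13.  When IE and SF are
both set, C1 = 1 reports a stack overflow and C1 = 0 a stack underflow.  A small
sketch of the decoding in C follows; the struct and function names are invented
for this example.

```c
#include <stdbool.h>
#include <stdint.h>

enum stack_fault { FAULT_NONE, FAULT_UNDERFLOW, FAULT_OVERFLOW };

struct x87_status {
    int  top;  /* bits 11-13: physical register currently acting as st(0) */
    bool ie;   /* bit 0: invalid-operation flag (sticky) */
    bool sf;   /* bit 6: stack-fault flag (sticky) */
    bool c1;   /* bit 9: condition code C1 */
};

/* Hypothetical helper: decode the low 16 bits of the status word. */
static struct x87_status decode_status(uint16_t sw)
{
    struct x87_status s;
    s.top = (sw >> 11) & 0x7;
    s.ie  = (sw & 0x0001) != 0;
    s.sf  = (sw & 0x0040) != 0;
    s.c1  = (sw & 0x0200) != 0;
    return s;
}

/* Only meaningful right after the faulting instruction: later instructions
   (fincstp, for one) overwrite C1 while IE and SF stay set. */
static enum stack_fault fault_kind(struct x87_status s)
{
    if (!(s.ie && s.sf))
        return FAULT_NONE;
    return s.c1 ? FAULT_OVERFLOW : FAULT_UNDERFLOW;
}
```

  For the 0x3241 above this yields TOP = 6 with IE, SF and C1 all set, i.e. a
stack overflow reported by the fptan.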

  ... and we increment the stack pointer and return the top one as the
function's result.

> =>R7: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R6: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3841   IE                       SF              
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffafff
> Instruction Pointer: 0x1b:0x004042b1
> Operand Pointer:     0xffff0023:0x01490518
> Opcode:              0xd95f
> Continuing.
> 
> Program exited normally.

  Hmm.  I think the C1, together with IE and SF, indicates it believes there has
been a stack overflow (C1 = 1 means overflow, C1 = 0 would mean underflow), and
maybe that happens because the r6 slot is valid rather than empty the second
time round, so the +1.0 that fptan pushes has nowhere to go; maybe _f_tan needs
to be 'popping' (or in some way marking empty) that unused +1.0 constant rather
than just skipping the stack pointer over it.  I'll see if that makes a
difference; I'm not an x87 specialist, I only know just enough to get by.  I
see that the fincstp documentation does warn that "this operation is not
equivalent to popping the stack", so I may be on the right track.

    cheers,
      DaveK



--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

