This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]


----Original Message----
>From: Jean-Christophe Kablitz
>Sent: 27 April 2005 00:22

> Hello,
> 
> I have noticed, that, while parsing {a float_value immediately followed by
> 'n' or 'N'} with the "%f%c" format, the sscanf function of cygwin-1.5.16-1
> behaves differently from the scanf function of cygwin-1.5.14-1.
> Until cygwin-1.5.14-1 (included), 'n' matches %c, while with
cygwin-1.5.15-1 
> and cygwin-1.5-16-1, 'n' is no more assigned to %c.
> 
> In the following test case, I would expect the progran to output
> i=2 x=1 m=a
> i=2 x=1 m=n
> 
> that was the case until cygwin-1.5.14-1 (included).
> 
> With cygwin-1.5.15-1 and cygwin-1.5-16-1, the program outputs instead
> i=2 x=1 m=a
> i=1 x=1 m=_
> 
> Maybe I have been misusing sscanf. Or there is a relationship with the
> NaN-parsing problem of the "newlib".

  No, your use of sscanf is perfectly correct!  Yes, there is a newly
exposed bug in the NaN parsing code, as you guessed; it falsely accepted the
N as part of 'NaN'.  Then, because it had begun by parsing a number, and
because it successfully parsed the number, it didn't go through the
'nan-parsing-has-failed-so-put-back-the-eaten-chars' bit that my last fix
introduced.

> --- beginning of test case ---
> jck:/sscanf> cat ssn.c
> #include <stdio.h>
> 
> int main()
> {
>     double x;
>     char   m;
>     int    i;
> 
>     x = 0.0;
>     m = '_';
>     i = sscanf("1.0a", "%lf%c", &x, &m);
>     printf("i=%d x=%g m=%c\n", i, x, m);
>     x = 0.0;
>     m = '_';
>     i = sscanf("1.0n", "%lf%c", &x, &m);
>     printf("i=%d x=%g m=%c\n", i, x, m);
>     return 0;
> }
> 
> jck:/sscanf> gcc -O0 ssn.c -o ssn.exe
> jck:/sscanf> ./ssn.exe
> i=2 x=1 m=a
> i=1 x=1 m=_
> --- end of test case ---

  Thank you for the simple test case; I was able to reproduce the problem
easily, although not exactly: the output I got was:

dk@mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=0 x=0 m=_

  It turns out there has been an underlying bug that was exposed with my
earlier fix.  The problem is in /src/newlib/libc/stdio/vfscanf.c, function
__SVFSCANF_R, case CT_FLOAT, where it's parsing a float and sees an 'n':

		case 'n':
		case 'N':
	          if (nancount == 0
		      && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK)))
		    {
		      flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS);
		      nancount = 1;
		      goto fok;
		    }
		  else if (nancount == 2)
		    {
		      nancount = 3;
		      goto fok;
		    }
		  break;

  The condition at the top of the loop is meant to be testing to ensure we
haven't already parsed any of the other possible components of an FP number,
but what it actually tests is whether or not we've parsed *all* the other
possible components; that's the only case it'll refuse to accept an 'n' at
present.  The reason it used to work is because after bogusly parsing the
'n', the old version then hits this bit of code when it comes time to parse
the %c field (CT_CHAR):

	case CT_CHAR:
	  /* scan arbitrary characters (sets NOSKIP) */
	  if (width == 0)
	    width = 1;

  I don't understand what this is doing, but it looks like some kind of
kludge that's saying "If we got here, then we know there must have been a
char to parse, so if we don't have any, we must have bogusly consumed it
already, so pretend it's there anyway".  Or something; like I say, I don't
understand it, but it looks like a kludge to me.

  Anyway, the attached patch changes the bitwise-AND (&) to an equality (==)
operator, which genuinely tests that we haven't parsed anything else at all;
it's effectively verifying that the flags haven't changed from their initial
value before beginning to attempt to parse the possible 'NaN' string.  This
fixes the testcase for me: I now see

dk@mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=2 x=1 m=n

and indeed, with an expanded version of it, which also verifies the amount
of characters consumed, I see:

dk@mace /test/sscanf> cat ssn.c
#include <stdio.h>
int main()
{
    double x;
    char   m;
    int    i, n;

    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0a", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0n", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0na", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0nan", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0e", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0f", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    return 0;
}

dk@mace /test/sscanf> gcc -ggdb -O0 ssn.c -o ssn
dk@mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a n=4
i=2 x=1 m=n n=4
i=2 x=1 m=n n=4
i=2 x=1 m=n n=4
i=2 x=1 m=e n=4
i=2 x=1 m=f n=4

so I reckon it's all good now.  Jean-Cristophe, you might want to try this
patch locally, or you can wait for it to be incorporated into newlib, at
which point it will start to show up in cygwin developer snapshots, or you
could just wait for the next cygwin dll release.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

Attachment: exposed-nan-bug.diff
Description: Binary data

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]