This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/15744] New: strtod is incorrect on INF/inf case variations in tr_TR.iso88599 locale


http://sourceware.org/bugzilla/show_bug.cgi?id=15744

            Bug ID: 15744
           Summary: strtod is incorrect on INF/inf case variations in
                    tr_TR.iso88599 locale
           Product: glibc
           Version: 2.17
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: vincent-srcware at vinc17 dot net
                CC: drepper.fsp at gmail dot com

strtod doesn't recognize some "INF" and "inf" case variations in the
tr_TR.iso88599 locale, e.g. "ınf" and "İNF", due to the dotless "i" and the "I"
with dot in Turkish. The C99 standard says (and it seems that this hasn't been
modified in C11):

    7.20.1.3 The strtod, strtof, and strtold functions
[...]
 3  The expected form of the subject sequence is an optional plus or minus
    sign, then one of the following:
    - a nonempty sequence of decimal digits optionally containing a
      decimal-point character, then an optional exponent part as defined
      in 6.4.4.2;
    - a 0x or 0X, then a nonempty sequence of hexadecimal digits
      optionally containing a decimal-point character, then an optional
      binary exponent part as defined in 6.4.4.2;
    - INF or INFINITY, ignoring case
    - NAN or NAN(n-char-sequence_opt), ignoring case in the NAN part,
[...]
 5  In other than the "C" locale, additional locale-specific subject
    sequence forms may be accepted.

Note that since strtod is locale sensitive (as confirmed in the comments of bug
7021), the "ignoring case" needs to take the current locale into account.

Testcase:

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <ctype.h>
#include <string.h>
#include <strings.h>

#define STRINGIFY(S) #S
#define MAKE_STR(S) STRINGIFY(S)

#define NUMD 1.67111104753282335
#define NUMS MAKE_STR(NUMD)

void stest (const char *buffer, const char *prefix)
{
  char *endptr;
  double x;

  x = strtod (buffer, &endptr);
  printf ("  %sstrtod(\"%s\",endptr) = %f%s\n", prefix,
          buffer, x, *endptr ? " (endptr error)" : "");
}

int main (int argc, char **argv)
{
  int i, j, k;
  double d = NUMD;
  double inf = 1.0 / 0.0;
  char *infs[] = { "INF", "inf" };
  char buffer[64], prefix[6];

  if (setlocale (LC_ALL, "") == NULL)
    {
      fprintf (stderr, "locale-test: can't set locales\n");
      exit (EXIT_FAILURE);
    }

  printf ("* Output:\n  d = " NUMS " (string)\n");
  printf ("  d = %f (in decimal, %%f format)\n", d);
  printf ("  Infinity: %f\n", inf);
  printf ("* Input:\n");
  strcpy (buffer, NUMS);
  stest (buffer, "");
  sprintf (buffer, "%f", d);
  stest (buffer, "");
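  /* For each base string ("INF", then "inf"), convert its first j
     characters with tolower (for "INF") or toupper (for "inf") under
     the current locale, then pass the result to strtod.  The prefix
     shows whether strcasecmp matches "INF" and "inf" respectively.  */
  for (i = 0; i < 2; i++)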
    for (j = 0; j < 4; j++)
      {
        for (k = 0; k < 3; k++)
          {
            buffer[k] = infs[i][k];
            if (j > k)
              buffer[k] = (i ? toupper : tolower)(buffer[k]);
          }
        buffer[3] = '\0';
        sprintf (prefix, "[%d%d] ",
                 !strcasecmp (buffer, "INF"),
                 !strcasecmp (buffer, "inf"));
        stest (buffer, prefix);
      }

  for (i = 1; i < argc; i++)
    {
      double x;
      char *end;

      x = strtod (argv[i], &end);
      printf ("Argument %d: ", i);
      if (*end == '\0')
        printf ("%.17g\n", x);
      else
        printf ("error\n");
    }

  return 0;
}

It can be tested by running it under the tr_TR.iso88599 locale directly (if the
terminal uses the associated fonts), or with:

  LC_ALL=tr_TR.iso88599 ./locale-test | iconv -f iso88599

from any UTF-8 based locale. I get:

* Output:
  d = 1.67111104753282335 (string)
  d = 1,671111 (in decimal, %f format)
  Infinity: inf
* Input:
  strtod("1.67111104753282335",endptr) = 1,000000 (endptr error)
  strtod("1,671111",endptr) = 1,671111
  [11] strtod("INF",endptr) = inf
  [00] strtod("ıNF",endptr) = 0,000000 (endptr error)
  [00] strtod("ınF",endptr) = 0,000000 (endptr error)
  [00] strtod("ınf",endptr) = 0,000000 (endptr error)
  [11] strtod("inf",endptr) = inf
  [00] strtod("İnf",endptr) = 0,000000 (endptr error)
  [00] strtod("İNf",endptr) = 0,000000 (endptr error)
  [00] strtod("İNF",endptr) = 0,000000 (endptr error)

The endptr error for "1.67111104753282335" is expected: together with the
following line, it shows that strtod is locale sensitive, since this locale
uses a comma as the decimal-point character. The next 8 lines should then all
succeed, which is not the case here.

Note that the C standard requires "INF" to be recognized, ignoring case, but
says nothing about "inf". However, the IEEE 754-2008 standard mentions "inf"
(ignoring case) and glibc uses "inf" as the output, so it should also be
recognized, ignoring case.
