This is the mail archive of the
libc-locales@sources.redhat.com
mailing list for the GNU libc locales project.
Re: Several bugs in glibc locale implementation?
- From: mjn3 at codepoet dot org (Manuel Novoa III)
- To: Behdad Esfahbod <behdad at cs dot toronto dot edu>
- Cc: Petter Reinholdtsen <pere at hungry dot com>, Hamed Hatami <hamed at cs dot toronto dot edu>, libc-locales at sources dot redhat dot com
- Date: Wed, 26 May 2004 19:18:12 -0600
- Subject: Re: Several bugs in glibc locale implementation?
- References: <E1BSCGR-0003by-00@minerva.hungry.com> <Pine.LNX.4.58.0405261952530.19731@epoch.cs>
Hello,
On Wed, May 26, 2004 at 07:57:57PM -0400, Behdad Esfahbod wrote:
> Hi,
>
> I just got the time to test a few of them:
>
> unget-putc-segfault.c: already fixed.
> ftell-wideio.c: already fixed.
>
> collation-narrow-wide-bug.c: looks like a bug.
> collation-undefined.c: I can still reproduce that, but not
> sure if it's a bug or not.
>
> locale-initialization-bug.c: is a bug.
>
> Remains the printf and scanf stuff. Hamed, can you look at
> printf-*, scanf-*, and locale-initialization-bug.c please? The
> latter uses fa_IR, and the rest should be related to your recent
> work on them.
Some of the scanf bugs in that post were uClibc bugs which were fixed.
Some were glibc bugs which were also fixed. Unfortunately, even after
quoting from the standards, from official responses to defect reports,
and supplying illustrative tests, Ulrich Drepper refused to even
acknowledge subsequent posts and I eventually gave up.
I have not looked at or tested against glibc since. However, here's
the list of issues I was aware of at the time and that I added to
our glibc vs uClibc differences file. Again, some may have been
fixed since. I'd suggest browsing the mailing list for some of my
posts during that time. You'll find additional rationale and tests.
Manuel
glibc bugs that Ulrich Drepper has refused to acknowledge or comment on
( http://sources.redhat.com/ml/libc-alpha/2003-09/ )
-----------------------------------------------------------------------
1) The C99 standard says that for printf, a %s conversion makes no special
provisions for multibyte characters. SUSv3 is even more clear, stating
that bytes are written and a specified precision is in bytes. Yet glibc
treats the arg as a multibyte string when a precision is specified and
not otherwise.
2) Both C99 and C89 state that the %c conversion for scanf reads the exact
number of bytes specified by the optional field width (or 1 if not specified).
uClibc complies with the standard. There is an argument that perhaps the
specified width should be treated as an upper bound, based on some historical
use. However, such behavior should be mentioned in the Conformance document.
3) glibc's scanf is broken regarding some numeric patterns. Some invalid
strings are accepted as valid ("0x.p", "1e", digit grouped strings).
In spite of my posting examples clearly illustrating the bugs, they remain
unacknowledged by the glibc developers.
4) glibc's scanf seems to require a 'p' exponent for hexadecimal float strings.
According to the standard, this is optional.
5) C99 requires that once an EOF is encountered, the stream should be treated
as if at end-of-file even if more data becomes available. Further reading
can be attempted by clearing the EOF flag though, via clearerr() or a file
positioning function. For details concerning the original change, see
Defect Report #141. glibc is currently non-compliant, and the developers
did not comment when I asked for their official position on this issue.
6) glibc's collation routines and/or localedef are broken regarding implicit
and explicit UNDEFINED rules.
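
For items 1 and 2, here is a minimal sketch of what the byte-oriented reading implies. The helper names precision_bytes() and read_three() are made up for illustration. The comments describe what the standards call for as summarised above. They say nothing about what a particular glibc or uClibc release actually does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format src with "%.*s" and return snprintf's result: the number of
 * bytes the conversion produced, or a negative value on error.
 * Under the byte-oriented reading, the precision counts bytes. */
int precision_bytes(const char *src, int precision, char *out, size_t outsz)
{
    assert(src != NULL && out != NULL);
    return snprintf(out, outsz, "%.*s", precision, src);
}

/* Read three bytes with "%3c". The %c conversion stores no terminating
 * NUL, so dst is zero-filled first to make it usable as a string.
 * Returns sscanf's result (number of conversions assigned). */
int read_three(const char *input, char dst[4])
{
    assert(input != NULL && dst != NULL);
    memset(dst, 0, 4);
    return sscanf(input, "%3c", dst);
}
```

Called with the four-byte UTF-8 string "\xc3\xa9\xc3\xa9" and a precision of 3, a byte-oriented printf produces exactly three bytes and splits the second character. read_three() applied to "abcdef" stores "abc". Input shorter than the field width is the disputed case from item 2 and is deliberately not exercised here.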
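
For items 3 and 4, C99 defines the input accepted by scanf's floating-point conversions in terms of the subject sequence expected by strtod(), so strtod() can serve as a rough reference when checking individual strings. The sketch below assumes the default "C" locale, and the helper name parses_fully() is hypothetical. It only reports whether strtod() consumes a string in full. It does not reproduce scanf's one-character pushback rules.

```c
#include <assert.h>
#include <stdlib.h>

/* Return 1 if strtod() converts the whole of s as a single number,
 * 0 otherwise. The converted value (or 0.0 if nothing was converted)
 * is stored in *value. */
int parses_fully(const char *s, double *value)
{
    char *end;

    assert(s != NULL && value != NULL);
    *value = strtod(s, &end);
    /* end == s means no conversion; a non-NUL *end means trailing junk. */
    return end != s && *end == '\0';
}
```

With this check, "0x1.8" (a hexadecimal float without a 'p' exponent) is consumed in full and yields 1.5, while "0x.p", "1e" and "1,000" are not. Those are the strings a conforming scanf would be expected to reject or stop short on.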
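
For item 5, a small sketch of the portable part of the end-of-file rule: a failed read sets the end-of-file indicator, and clearerr() resets it. The function name eof_indicator_roundtrip() is made up for illustration. Whether a further read without clearerr() returns newly available data is the disputed behaviour and is not tested here.

```c
#include <stdio.h>

/* Drive a temporary stream to end-of-file, then clear the indicator.
 * Returns 1 if feof() was set by the failed read and reset by
 * clearerr(), 0 if not, and -1 if no temporary file could be created. */
int eof_indicator_roundtrip(void)
{
    FILE *fp = tmpfile();
    int c, was_set, was_cleared;

    if (fp == NULL)
        return -1;

    fputc('a', fp);
    rewind(fp);                 /* reposition before switching to input */
    (void)fgetc(fp);            /* consumes 'a' */
    c = fgetc(fp);              /* nothing left: returns EOF */
    was_set = (c == EOF) && feof(fp) != 0;

    clearerr(fp);               /* explicit reset, as item 5 describes */
    was_cleared = (feof(fp) == 0);

    fclose(fp);
    return was_set && was_cleared;
}
```

To observe the behaviour item 5 complains about, one would append to the same file through a second handle after the EOF and then call fgetc() again without clearerr(). The result of that extra read depends on the C library in use.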