This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: strtoul(-1) discussion
- From: Fabrice Bauzac <libnoon at gmail dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: libc-help at sourceware dot org
- Date: Mon, 15 Apr 2013 17:30:26 +0200
- Subject: Re: strtoul(-1) discussion
- References: <CAB6Q1a-m=W4cA3WCeuB1FzFWHbistLrN3wR1hfo4mfPB9j9qJw at mail dot gmail dot com> <516C0EA1 dot 6050001 at redhat dot com>
2013/4/15 Florian Weimer <fweimer@redhat.com>:
> On 04/07/2013 03:20 PM, Fabrice Bauzac wrote:
> POSIX 2008 also says: "If the subject sequence begins with a minus-sign, the
> value resulting from the conversion shall be negated." I think that means
> that POSIX actually mandates the (admittedly weird) glibc behavior. The C
> and POSIX committees probably couldn't bring themselves to clearly specify
> this behavior, and that's where the ambiguity comes from.
I have another interpretation. To me, the "value" here is the
mathematical, converted value, which is not necessarily a value
representable in the target C type (see RETURN VALUE section in [1]).
If there is a minus sign, then the mathematical value is negated.
Then, if that mathematical value does not fit in the target C type
(either because it is too large or because it is negative while the
type is unsigned), overflow would be reported.
"-0" is accepted by strtoul: it is desirable for strtoul (or strtol)
to accept the minus sign, but I think POSIX just says that if the
value does not fit, be it strtol or strtoul, then it should report
overflow.
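
A minimal C sketch of this reading, for illustration only. The helper
name strict_strtoul is hypothetical (it is not part of glibc, C or
POSIX); it wraps the ordinary strtoul and reports ERANGE whenever a
minus sign makes the mathematical value negative, while still
accepting "-0":

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not a library function: behaves like strtoul,
   except that a negative, non-zero mathematical value is treated as
   out of range (ULONG_MAX is returned and errno is set to ERANGE). */
unsigned long strict_strtoul(const char *s, char **end, int base)
{
    assert(s != NULL);

    /* strtoul skips leading white space before the optional sign,
       so do the same here in order to find the sign. */
    const char *p = s;
    while (isspace((unsigned char)*p))
        p++;
    int negative = (*p == '-');

    char *e;
    unsigned long n = strtoul(s, &e, base);
    if (end != NULL)
        *end = e;

    /* With the usual implementations, "-1" comes back from strtoul as
       ULONG_MAX without errno being set. Only a magnitude of zero
       negates to zero, so a non-zero result after a minus sign means
       the mathematical value was negative. */
    if (negative && n != 0) {
        errno = ERANGE;
        return ULONG_MAX;
    }
    return n;
}
```

As with strtoul itself, the caller sets errno to 0 before the call and
inspects it afterwards. With this sketch, "-1" reports ERANGE, whereas
"-0" and "42" convert without error.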
> It's even clearer for the %u conversion specifier, where this behavior is
> pretty explicitly mandated.
Could you please elaborate a little: are you talking about printf or
scanf? Which specification? Thanks!
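
For reference, a small C sketch of how the question can be tested, on
the assumption (not confirmed in this thread) that the scanf family's
%u conversion is the one meant. The C standard describes the input for
%u as an optionally signed decimal integer in the format strtoul
expects with base 10. The helper name scan_unsigned is hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: parse text with sscanf's %u conversion.
   Returns 1 if one item was converted and stored in *out, else 0. */
int scan_unsigned(const char *text, unsigned int *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%u", out) == 1;
}
```

On implementations that behave like GNU libc, scan_unsigned("-1", &u)
should succeed and leave UINT_MAX in u, mirroring what strtoul does
with "-1".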
> That's actually what I expect—GNU libc's behavior matches that of legacy
> systems.
> There are some other non-GNU resources which describe the odd behavior,
Maybe there was an early public-domain implementation that was then
copied into many UNIX variants, including BSD, Solaris and GNU? Would
that be possible?
However, I think I remember that the AIX libc implements the "normal"
behaviour of refusing "-1".
> There's an old discussion on comp.std.c dating back to September 1992,
> "strtoul on negative numbers". It strongly suggests that back then,
> everyone implemented the strange behavior.
I think I have found the discussion:
https://groups.google.com/forum/?fromgroups#!topic/comp.std.c/8LHELp65IqI
My current opinion is that the "sensible" meaning of the specification
(the one anybody would expect on learning that strtoul is meant to
convert a string to an unsigned long) has never been specified clearly
enough, and that this gap led to the BSD/Solaris/GNU
implementations...
Thanks!
Best regards
Fabrice
2013/4/15 Florian Weimer <fweimer@redhat.com>:
> On 04/07/2013 03:20 PM, Fabrice Bauzac wrote:
>>
>> The algorithm behind GNU libc's strtoul seems to be:
>>
>> * store the sign independently from the digits.
>> * perform a strtoul on the digits without the sign, producing ulong
>> value N.
>> * if the sign is negative, perform N := -N.
>> * return N.
>>
>> The problem is that [1] says:
>>
>> "If the correct value is outside the range of representable values,
>> {ULONG_MAX} or {ULLONG_MAX} shall be returned and errno set to
>> [ERANGE]."
>
> POSIX 2008 also says: "If the subject sequence begins with a minus-sign, the
> value resulting from the conversion shall be negated." I think that means
> that POSIX actually mandates the (admittedly weird) glibc behavior. The C
> and POSIX committees probably couldn't bring themselves to clearly specify
> this behavior, and that's where the ambiguity comes from.
>
> It's even clearer for the %u conversion specifier, where this behavior is
> pretty explicitly mandated.
>
>
>> I have made some experiments on Solaris, and the Solaris
>> implementation does not follow its specification. The behaviour of
>> Solaris's strtoul seems to be the same as GNU libc's strtoul!
>
>
> That's actually what I expect—GNU libc's behavior matches that of legacy
> systems.
>
> There are some other non-GNU resources which describe the odd behavior,
> e.g.:
>
> "This function returns an unsigned long int. If the given string starts with
> a minus sign, the result is a negative number that's cast to be unsigned."
>
> <http://www.users.pjwstk.edu.pl/~jms/qnx/help/watcom/clibref/src/strtoul.html>
>
> (The page source dates this to 1999, so it could have been influenced by
> glibc, but it's somewhat unlikely because GNU/Linux wouldn't have been
> considered the gold standard back then. And the QNX folks probably don't
> even today.)
>
> There's an old discussion on comp.std.c dating back to September 1992,
> "strtoul on negative numbers". It strongly suggests that back then,
> everyone implemented the strange behavior.
>
> I guess we're stuck with it.
>
>
> --
> Florian Weimer / Red Hat Product Security Team