This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: "C" character set (again)


According to Thomas Wolff on 1/8/2010 7:40 AM:
> While Andy had a valid point in finding *format* to be described as a
> "character string" and relating that to a generic POSIX definition of
> character,
> this certainly does not justify the current behaviour of silent dropping
> and reporting partial success because that is not one of the options in
> the "RETURN VALUE" section;
> also I don't see what Andy's claim "Including invalid bytes in the
> format string is undefined behaviour." is based on.

Per POSIX, printf is only defined if you pass a valid character string as
the format.  If you pass an 8-bit value but it is not a character, then
you did not pass a valid format string.  That's why your behavior was
undefined, and so ANYTHING can happen (the current approach of stopping
right there, or the proposed approach of still being 8-bit clean and
passing on the invalid character anyway).  That's the whole point of
undefined - the standard doesn't say what has to happen, because there is
no rule about what has to happen.
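
To make the distinction concrete, here is a minimal sketch of how a caller
could check whether a byte string is a valid character string in the
current locale before using it as a format.  The helper name is_valid_mbs
is hypothetical (it is not part of POSIX or Cygwin); it relies only on the
standard mbrtowc function.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return 1 if s consists only of complete, valid
   multibyte characters in the current locale, 0 otherwise. */
static int is_valid_mbs(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);

    memset(&st, 0, sizeof st);          /* start in the initial shift state */
    while (left > 0) {
        size_t n = mbrtowc(NULL, s, left, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return 0;                   /* invalid or truncated sequence */
        if (n == 0)                     /* defensive: treat a NUL as one byte */
            n = 1;
        s += n;
        left -= n;
    }
    return 1;
}
```

Whether a byte above 0x7f passes this check depends on the charset of the
locale in effect, which is exactly why the charset chosen for "C" matters
here.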

My opinion is that it would still be nice to keep "C" in the UTF-8 charset
(to encourage people to fix their programs that do not comply with POSIX
rules about the C locale), but to fix 8-bit transparency issues in as many
APIs as possible (such as printf) so that invalid characters are at least
still handled as transparently-clean 8-bit bytes.  In the long run,
sticking with the UTF-8 charset will only be doing users a favor, even if
we end up having to point people to the FAQ about locale implications.
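
As an illustration of what 8-bit transparency could look like, here is a
rough sketch (not the actual newlib or Cygwin code) of an output routine
that decodes with mbrtowc but passes any undecodable byte through
unchanged instead of stopping.  The name put_8bit_clean is made up for
this example.

```c
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: write s to out, copying bytes that do not form a
   valid character through unchanged.  Returns the number of such bytes,
   or -1 on a write error. */
static int put_8bit_clean(const char *s, FILE *out)
{
    mbstate_t st;
    size_t left = strlen(s);
    int invalid = 0;

    memset(&st, 0, sizeof st);
    while (left > 0) {
        size_t n = mbrtowc(NULL, s, left, &st);
        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Not a character: emit one raw byte and resynchronise. */
            memset(&st, 0, sizeof st);
            n = 1;
            invalid++;
        } else if (n == 0) {
            n = 1;                      /* defensive: NUL counts as one byte */
        }
        if (fwrite(s, 1, n, out) != n)
            return -1;
        s += n;
        left -= n;
    }
    return invalid;
}
```

The output is byte-for-byte identical to the input either way; the return
value only tells the caller how many bytes were not valid characters.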

-- 
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net

