This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
According to Thomas Wolff on 1/8/2010 7:40 AM:
Part of my point was to work this out more precisely, so please let me stir around once more:
While Andy had a valid point in finding *format* to be described as a
"character string" and relating that to a generic POSIX definition of
character,
this certainly does not justify the current behaviour of silent dropping
and reporting partial success because that is not one of the options in
the "RETURN VALUE" section;
also I don't see what Andy's claim "Including invalid bytes in the
format string is undefined behaviour." is based on.
Per POSIX, printf is only defined if you pass a valid character string as the format.
Based on what? The manpage explicitly lists a number of cases where "results are undefined" (e.g. insufficient arguments), but an invalid character in the format string is *not* among them, unless you count the EILSEQ clause towards that, which I wouldn't, because the format string is not a wide character string.
That may be the case (by now I'm almost convinced by Andy and you), and yet... see above: it doesn't mean "completely undefined".
If you pass an 8-bit value but it is not a character, then you did not pass a valid format string.
That's why your behavior was undefined, and so ANYTHING can happen.
No, not quite anything: APIs are not pure maths, and the Return Value conditions still have to be met.
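To make the Return Value point concrete, here is a minimal sketch (an illustration only, not code taken from newlib or any other C library). It uses snprintf rather than printf so that the output can be inspected, on the assumption that both share the same format parser. The helper name classify_stray_byte_format and the enum are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of what the call reported. */
enum fmt_outcome { FMT_COMPLETE, FMT_ERROR, FMT_PARTIAL };

/* Formats a fixed string containing one stray byte (0x80) and checks
   the result against the two outcomes the RETURN VALUE section allows:
   the full byte count, or a negative value. */
static enum fmt_outcome classify_stray_byte_format(void)
{
    char buf[16];
    /* "a", the stray byte 0x80, then "b": three bytes, no conversions.
       The literal is split so that 'b' is not read as a hex digit. */
    int ret = snprintf(buf, sizeof buf, "a\x80" "b");

    if (ret < 0)
        return FMT_ERROR;      /* failure was reported to the caller */
    if (ret == 3 && memcmp(buf, "a\x80" "b", 3) == 0)
        return FMT_COMPLETE;   /* all three bytes passed through */
    return FMT_PARTIAL;        /* bytes dropped, yet success reported */
}
```

FMT_COMPLETE and FMT_ERROR each match a documented outcome; FMT_PARTIAL corresponds to the silent dropping with partial success that I am objecting to.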
...
Yes, please.
My opinion is that it would still be nice to keep "C" in the UTF-8 charset (to encourage people to fix their programs that do not comply with POSIX rules about the C locale), but to fix 8-bit transparency issues in as many APIs as possible (such as printf) so that invalid characters are at least still handled as transparently-clean 8-bit bytes.
In the long run, sticking with the UTF-8 charset will only be doing users a
favor, even if we end up having to point people to the FAQ about locale
implications.
With this, I fully agree.
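As a rough sketch of what "8-bit transparent" could mean for the literal parts of a format string (an assumption for illustration, not the actual newlib code), the hypothetical helper copy_literal_run below copies everything before the first '%' byte for byte, without trying to decode multibyte characters. In UTF-8 every byte of a multibyte sequence is 0x80 or above, so '%' (0x25) can never occur inside one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the literal run at the start of fmt
   (everything before the first '%' or the terminating NUL) into out,
   byte for byte, with no multibyte decoding. Returns the number of
   bytes copied, or (size_t)-1 if out is too small. */
static size_t copy_literal_run(const char *fmt, char *out, size_t outsz)
{
    size_t n = strcspn(fmt, "%");  /* length of the run before '%' */

    if (n > outsz)
        return (size_t)-1;
    memcpy(out, fmt, n);           /* bytes >= 0x80 pass through unchanged */
    return n;
}
```

With this approach a stray 0x80 in the format is neither dropped nor rejected; it simply ends up in the output, which is the transparent handling asked for above.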
------ Thomas