This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: iswxxxxx/towxxxer and Unicode
- To: Bruno Haible <haible at ilog dot fr>
- Subject: Re: iswxxxxx/towxxxer and Unicode
- From: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- Date: Tue, 18 Jul 2000 21:53:38 +0100 (BST)
- cc: libc-alpha at sourceware dot cygnus dot com
On Tue, 18 Jul 2000, Bruno Haible wrote:
> > No. This means NUL must not be passed to any of these functions.
>
> The SUSV2 clearly says the contrary. It contains language sufficient
> to conclude that 0x0000 is a valid character and therefore is suitable
> as an argument to iswcntrl:
This can already be deduced from the 1990 C standard (though the iswcntrl
function itself only appears in the 1994 amendment):
3.13 multibyte character: A sequence of one or more bytes representing a
member of the extended character set of either the source or the execution
environment. The extended character set is a superset of the basic
character set.
5.2.1 Character sets: ... A byte with all bits set to 0, called the null
character, shall exist in the basic execution character set; it is used to
terminate a character string literal.
5.2.1.2 Multibyte characters: ... The single-byte characters defined in
5.2.1 shall be present. ... A byte with all bits zero shall be interpreted
as a null character independent of shift state.
7.10.7.2 The mbtowc function: ... It then determines the code for the
value of type wchar_t that corresponds to that multibyte character. (The
value of the code corresponding to the null character is zero.)
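
To make the chain of deductions concrete, here is a minimal sketch in C of what the quoted clauses imply: converting the null multibyte character with mbtowc yields the wide-character code zero, and that code can then be passed to iswcntrl. The helper names null_wide_code and null_is_cntrl are hypothetical, made up for this illustration. Whether iswcntrl reports true for the null character is locale-specific, so the sketch only relies on the call being well defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not part of the standard or of glibc): convert
   the null multibyte character and return its wide-character code.  */
static wchar_t
null_wide_code (void)
{
  wchar_t wc = L'x';          /* sentinel, overwritten by mbtowc */
  int len;

  mbtowc (NULL, NULL, 0);     /* reset to the initial shift state */
  len = mbtowc (&wc, "", 1);  /* "" consists of the null byte only */
  assert (len == 0);          /* 0: the input was the null character */
  return wc;                  /* zero, per 7.10.7.2 */
}

/* Hypothetical helper: pass the null wide character to iswcntrl.
   The result depends on the locale; the point is that the argument
   is a valid wide character.  */
static int
null_is_cntrl (void)
{
  return iswcntrl ((wint_t) null_wide_code ()) != 0;
}
```

Calling null_wide_code () from any program should return a value that compares equal to L'\0'. null_is_cntrl () then shows that 0 is an ordinary argument to the classification function.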
--
Joseph S. Myers
jsm28@cam.ac.uk