This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Fix localedef collation handling of <U0000> (bug 15948)
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: <libc-alpha at sourceware dot org>
- Date: Tue, 24 Sep 2013 22:05:01 +0000
- References: <Pine dot LNX dot 4 dot 64 dot 1309122206530 dot 28584 at digraph dot polyomino dot org dot uk> <20130924215529 dot 996342C099 at topped-with-meat dot com>
On Tue, 24 Sep 2013, Roland McGrath wrote:
> > as can be seen by adding "assert (runp->nwcs > 0);" before that code,
>
> So add that assert.
>
> > This patch causes such a sequence to be treated as length 1 instead.
>
> This seems reasonable on its face. But I don't understand enough about the
> context to be sure how this case comes about or what it ought to mean.
> It would be much easier to evaluate with an example of how the case occurs.
The examples are the cases where the assertion fails, such as
ar_SA.UTF-8. What triggers the problem is the appearance of collation
information for <U0000> in the locale sources. I don't know *why*
locales/ar_SA includes <U0000> in LC_COLLATE. More generally, I'm not
sure how much individual locales need their own collation information at
all, as opposed to using the ISO 14651 information (which, as noted in
bug 14095, could do with someone figuring out how to update it).
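
For illustration only, here is a minimal sketch in C of the kind of
failure being discussed. It is not the localedef code: struct
element_sketch, naive_length and effective_length are hypothetical
names, and using wcslen to explain how a length of zero could arise is
an assumption. Only the nwcs field name and the "treat as length 1"
behaviour come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for a collation element; only the field named
   in the thread (nwcs) is modelled here.  */
struct element_sketch
{
  const wchar_t *wcs;  /* wide-character sequence */
  size_t nwcs;         /* number of wide characters in wcs */
};

/* Length measured the way a NUL-terminated wide string is measured.
   ASSUMPTION: this is one plausible way a sequence holding only
   U+0000 could end up with a length of zero.  */
static size_t
naive_length (const wchar_t *wcs)
{
  return wcslen (wcs);
}

/* Treat a zero-length sequence as length 1, mirroring the behaviour
   the patch is described as having.  Other lengths are unchanged.  */
static size_t
effective_length (const struct element_sketch *e)
{
  return e->nwcs == 0 ? 1 : e->nwcs;
}
```

With a sequence consisting only of U+0000, naive_length returns 0, so
an "assert (runp->nwcs > 0);" style check would fail, whereas
effective_length reports 1.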
--
Joseph S. Myers
joseph@codesourcery.com