This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug regex/23393] Handle [a-z] and [A-Z] in consistent portable fashion regardless of locale.


https://sourceware.org/bugzilla/show_bug.cgi?id=23393

--- Comment #24 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to Carlos O'Donell from comment #22)
> (In reply to Florian Weimer from comment #20)
> > The point Rich and I are making is that there is no requirement in POSIX to
> > have ranges following collation sorting.  Our current implementations do
> > this, but it's not required by POSIX.  We can change the code (and not the
> > data).
> 
> This is not my interpretation.
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
> 
> ~~~
> 7. In the POSIX locale, a range expression represents the set of collating
> elements that fall between two elements in the collation sequence, inclusive.
> ~~~
> 
> We would not meet that rule if we used code points?

For ASCII-based implementations, the order is the same.  From “LC_COLLATE
Category in the POSIX Locale”:

# This is the minimum input for the POSIX locale definition for the
# LC_COLLATE category. Characters in this list are in the same order
# as in the ASCII codeset.

And a cursory glance at the definition suggests that the comment is accurate.
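
The claim above can be spot-checked on a given system. The following is a minimal sketch, not part of the original comment: the helper name collation_matches_ascii is hypothetical, and it only compares single-character strings with strcoll(), so it says nothing about multi-character collating elements.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: checks whether strcoll() orders every pair of
   single ASCII characters (codes 1..127) the same way as their code
   points under the named LC_COLLATE locale.
   Returns 1 if the orders agree, 0 if they differ, and -1 if the
   locale could not be selected. */
int collation_matches_ascii(const char *locale_name)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;

    char a[2] = { 0, 0 };
    char b[2] = { 0, 0 };

    for (int i = 1; i < 128; i++) {
        for (int j = 1; j < 128; j++) {
            a[0] = (char)i;
            b[0] = (char)j;

            int c = strcoll(a, b);
            int got = (c > 0) - (c < 0);    /* sign of the collation result */
            int want = (i > j) - (i < j);   /* sign of the code point difference */

            if (got != want)
                return 0;
        }
    }
    return 1;
}
```

Calling collation_matches_ascii("C") from a small main() is expected to return 1 on an ASCII-based implementation. For other locale names the result depends on the installed locale data.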

> You argue that the "unspecified behaviour" (not undefined), would be changed?

Yes, or not be changed, for the en_US locale and many common range expressions.
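
To see what a range expression currently matches under a particular locale, a small probe along these lines may help. It is a sketch only: the helper name range_matches is hypothetical, it uses the POSIX regcomp()/regexec() interface, and locales such as en_US.UTF-8 may not be installed on every system, in which case it returns -1.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: reports whether `pattern` (a POSIX extended
   regular expression) matches the one-character string made of `ch`
   after selecting `locale_name`.
   Returns 1 on a match, 0 on no match, and -1 if the locale is
   unavailable or the pattern does not compile. */
int range_matches(const char *locale_name, const char *pattern, char ch)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    char s[2] = { ch, '\0' };
    int rc = regexec(&re, s, 0, NULL, 0);

    regfree(&re);
    return rc == 0;
}
```

For example, range_matches("C", "^[a-z]$", 'B') is expected to return 0, because in the POSIX locale the range covers only the lowercase ASCII letters. Comparing that with the result for another locale name shows whether the range is handled the same way there.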
