Problem with Bash regex test case sensitivity

Lee Rothstein lee@veritech.com
Sat Dec 4 21:49:00 GMT 2010


On 12/4/2010 10:06 AM, Corinna Vinschen wrote:

 > On Dec  4 10:05, Lee wrote:

 >> On 12/3/10, Eric Blake <eblake@ > wrote:
 >>> Read the FAQ.  http://www.faqs.org/faqs/unix-faq/shell/bash/, E9.

 >> Which says the en_US locale collates the upper and lower case
 >> letters like this:
 >>     AaBb...Zz

 >> I got that much :)  What I don't get is why someone would _want_ the
 >> collating sequence to be AaBb... or why that sequence was picked for
 >> en_US instead of using the natural order of A-Za-z.

 > It's not the "natural" order, it's an arbitrary order which was
 > chosen back in 1963 when the ASCII code was defined.  It's not used
 > as "natural" order outside of computer systems and it's not even the
 > natural order on some computer systems (see EBCDIC).

 > If you take a look into a hardcopy encyclopedia written in English,
 > you'll probably be quite comfortable that the words are ordered
 > lexicographically instead of by ASCII code.  Needless to say, the
 > ordering criteria for non-English languages may contain more
 > characters in the sequence, in German for instance

 >   "AaäBb...Ooö...Ssß...Uuü...Zz"

 > So, let's reiterate:

 > - If I need the order for the computer language, I say so:

 >    LC_COLLATE=C.UTF-8

 > - Otherwise, if I need the order for the natural language, I
 >   say so:

 >    LC_COLLATE=en_US.UTF-8
 >    LC_COLLATE=de_DE.UTF-8
 >    ...
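
The distinction above can be sketched in Bash as follows. This is a
minimal illustration, not something taken from the thread: the function
name is_ascii_upper is hypothetical, and it assumes Bash's
[[ ... =~ ... ]] regex test. It sets LC_ALL (which overrides LC_COLLATE
when set) to C inside a subshell, so '[A-Z]' means the ASCII range for
that one test and the caller's locale is left untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if $1 is a single ASCII capital letter.
# The function body is a subshell ( ... ), so the locale change cannot leak
# into the calling script.
is_ascii_upper() (
    LC_ALL=C                 # overrides LC_COLLATE; forces "computer" ordering
    [[ $1 =~ ^[A-Z]$ ]]      # exit status of the test is the function's status
)

is_ascii_upper Q && echo "Q: upper"
is_ascii_upper q || echo "q: not upper"
```

The subshell costs a fork per call; a script that does many such tests
might instead set the locale once near the top.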

Here's my takeaway, given Corinna's interesting and complete
context and my own intent. (My intent, BTW, is for my scripts
to be as general as possible [given my limited skills ;-|].)

Therefore, instead of using '[A-Z]' to represent caps, I should
have used (?) the Posixly Correct, '[:upper:]'.
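
One detail that matters for this change: a POSIX class name is only
recognized inside a bracket expression, so the working spelling is
'[[:upper:]]'. A bare '[:upper:]' is itself a bracket expression that
matches any one of the characters ':', 'u', 'p', 'e', 'r'. The attached
script is not reproduced here, so whether this applies to it is an
assumption. A minimal sketch, with a hypothetical function name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $1 begins with an uppercase letter
# according to the current locale's character classification.
starts_with_upper() {
    # Outer [ ] is the bracket expression; inner [:upper:] names the class.
    [[ $1 =~ ^[[:upper:]] ]]
}

for word in Cygwin cygwin; do
    if starts_with_upper "$word"; then
        echo "$word: starts with a capital"
    else
        echo "$word: does not start with a capital"
    fi
done
```

Unlike '[A-Z]', the class does not depend on the collating sequence, so
it should behave the same under en_US.UTF-8 and C for plain ASCII
letters.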

However, the test script (attached) still doesn't work on either
my Cygwin config or a Linux config with this change. (I have
not yet made the environment variable changes indicated above,
since I am still waiting for clarification of the new issue I
bring up here.)

The latter test would, IMHO, seem to imply that the changes to
'NIX shells were mandated by I18N considerations, BUT the other
required changes in code or default settings were NOT implemented.

This would seem to penalize only those folks who are conversant
with the long-standing conventions of the 'NIX world.

Please correct my misunderstanding if I'm wrong!

Lee

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: t_regex
URL: <http://cygwin.com/pipermail/cygwin/attachments/20101204/ad79ccfd/attachment.ksh>