CYGWIN=codepage? Or LC_CTYPE=foo?

Gregg Tavares unison@greggman.com
Mon Apr 7 05:31:00 GMT 2008


Corinna Vinschen wrote: 

> And while I already confused myself no end, here's another question. 
> 
> Shouldn't the (default) setting of LANG, LC_CTYPE and friends be based 
> on what the underlying OS is set to? 

LANG yes, LC_CTYPE no? 

At least to my understanding, Windows runs in Unicode internally, and
most current distributions of Linux run in UTF-8. So, if the goal is
to have Cygwin emulate Linux and also do the right thing most of the
time on Windows, then it seems like LC_CTYPE should be set to
codepage:utf-8 by default. That means it will work like Linux (better
emulation) and it will work like Windows (Windows is, after all,
running in Unicode).
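
To illustrate why UTF-8 lines up with both sides, here is a minimal
sketch in Python. It assumes that "Unicode internally" on Windows means
UTF-16 wide strings; the helper names and the sample filename are
hypothetical, and this says nothing about how the Cygwin DLL actually
converts names. It only shows that the round trip between the two
encodings loses nothing.

```python
def utf16_to_utf8(raw_utf16le: bytes) -> bytes:
    """Convert UTF-16LE bytes (assumed Windows-side form) to UTF-8."""
    return raw_utf16le.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(raw_utf8: bytes) -> bytes:
    """Convert UTF-8 bytes (assumed POSIX-side form) back to UTF-16LE."""
    return raw_utf8.decode("utf-8").encode("utf-16-le")


# Hypothetical filename mixing CJK and accented Latin characters.
name = "日本語_café.txt"
wide = name.encode("utf-16-le")   # what the OS side would hold
narrow = utf16_to_utf8(wide)      # what a UTF-8 program would see

# Every character survives the trip in both directions.
assert utf8_to_utf16(narrow) == wide
```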

The only time LC_CTYPE has any meaning is for very old Unix programs
that predate UTF-8. Since in Cygwin all Unix-based programs are
compiled against the Cygwin DLL and supporting libraries, they will
all do the correct thing if LC_CTYPE defaults to codepage:utf-8.

If LC_CTYPE is set to something else, then unless the user manually
sets it back to UTF-8, programs that attempt to deal with filenames
that don't fit the current codepage will fail. That seems bad to me.
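
A rough way to see the failure, sketched in Python. The function name
and the sample filenames are made up for illustration, and cp1252 is
only an assumed example of a non-UTF-8 codepage; the point is that a
name either can or cannot be expressed in the active codepage, while
UTF-8 can express any of them.

```python
def fits_codepage(filename: str, codepage: str) -> bool:
    """Return True if every character of filename exists in codepage."""
    try:
        filename.encode(codepage)
        return True
    except UnicodeEncodeError:
        # At least one character has no representation in this codepage.
        return False


# A Western European name fits cp1252, a Japanese one does not.
assert fits_codepage("café.txt", "cp1252")
assert not fits_codepage("日本語.txt", "cp1252")

# Both fit UTF-8.
assert fits_codepage("café.txt", "utf-8")
assert fits_codepage("日本語.txt", "utf-8")
```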

Also, if LC_CTYPE is not set to codepage:utf-8, then any Unix programs
communicating filenames machine to machine will almost always fail
for non-ASCII filenames.
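
The machine-to-machine case can be sketched the same way. The send and
receive helpers below are hypothetical stand-ins for any two programs
exchanging a filename as raw bytes, and the codepages are only assumed
examples. When the two ends disagree, the receiver either cannot decode
the bytes at all or silently gets a different name.

```python
from typing import Optional


def send(filename: str, sender_encoding: str) -> bytes:
    """Encode a filename the way the sending machine's locale would."""
    return filename.encode(sender_encoding)


def receive(wire: bytes, receiver_encoding: str) -> Optional[str]:
    """Decode the bytes on the receiving machine, or None on failure."""
    try:
        return wire.decode(receiver_encoding)
    except UnicodeDecodeError:
        return None


name = "café.txt"

# Sender in cp1252, receiver in UTF-8: the bytes are not valid UTF-8.
assert receive(send(name, "cp1252"), "utf-8") is None

# Sender in cp1252, receiver in cp1251: decodes, but to the wrong name.
assert receive(send(name, "cp1252"), "cp1251") != name

# Both ends in UTF-8: the name arrives intact.
assert receive(send(name, "utf-8"), "utf-8") == name
```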


