"C" UTF-8 trouble

Thomas Wolff towo@towo.net
Wed Oct 7 14:06:00 GMT 2009


Corinna Vinschen wrote:
> ...
>
> $ ./nll
> ANSI_X3.4-1968
>
> $ LANG=C.UTF-8 ./nll
> ANSI_X3.4-1968
>
> $ LANG=ja_JP ./nll
> EUC-JP
>
> $ LANG=ru_RU ./nll
> ISO-8859-5
>
> $ LANG=ru_UA ./nll
> KOI8-U
>
> $ LANG=zh_CN ./nll
> GB2312
>
> $ LANG=zh_TW ./nll
> BIG5
>
> Sigh.  Do we really need a translation table?
>
Yes (sigh). And yes, that's what I had suggested before. Actually, 
"locale charmap" (on a system with a locale command) gives you the same 
information as "nll".
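
For reference, a program like "nll" presumably boils down to a setlocale()
call followed by nl_langinfo(CODESET). The source of nll is not shown in
this thread, so the following is only a sketch of that assumption, and
codeset_of is a made-up helper name:

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from nll): switch LC_CTYPE to the given
 * locale and return the codeset name the C library reports for it.
 * Pass "" to pick the locale up from the environment (LC_ALL, LC_CTYPE,
 * LANG). Returns NULL if the locale cannot be set.
 */
const char *codeset_of(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Calling codeset_of("") and printing the result should give the kind of
output quoted above; the exact strings depend on the C library.
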
If you want a table, a fairly complete one is included in my package 
mined, file src/locales.t (generated from src/locales.cfg).
(Complete in the sense that any locale without an explicit codeset suffix
that is not listed there maps to ISO-8859-1; maybe I should list those as
well, so that unknown locales can be distinguished ...)
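
A translation table of this kind could look roughly like the sketch below.
It is not the content of src/locales.t: the entries are just the five
locales from the quoted output, the ISO-8859-1 fallback follows the
description above, and the names locale_codeset and default_codeset are
invented for illustration. The "C" locale and locale names with an
explicit suffix are left out on purpose.

```c
#include <stddef.h>
#include <string.h>

/* One table row: locale name without suffix, and its default codeset. */
struct locale_codeset {
    const char *locale;
    const char *codeset;
};

/* Illustrative entries only, taken from the quoted nll output. */
static const struct locale_codeset locale_table[] = {
    { "ja_JP", "EUC-JP" },
    { "ru_RU", "ISO-8859-5" },
    { "ru_UA", "KOI8-U" },
    { "zh_CN", "GB2312" },
    { "zh_TW", "BIG5" },
};

/*
 * Look up the default codeset for a locale name without a suffix.
 * Returns NULL if the name carries an explicit suffix (contains '.'),
 * since the table is not meant for that case. Names not listed fall
 * back to ISO-8859-1.
 */
const char *default_codeset(const char *locale_name)
{
    size_t i;

    if (strchr(locale_name, '.') != NULL)
        return NULL;
    for (i = 0; i < sizeof locale_table / sizeof locale_table[0]; i++) {
        if (strcmp(locale_table[i].locale, locale_name) == 0)
            return locale_table[i].codeset;
    }
    return "ISO-8859-1";
}
```

With this fallback an unknown locale is indistinguishable from a real
ISO-8859-1 one, which is the drawback mentioned above.
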
And, as becomes clear here, charmap/codeset names are spelled differently
in locale names than in nl_langinfo output, e.g. eucJP vs. EUC-JP.
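
One possible way to bridge the two spellings, sketched here as an
assumption rather than as what any particular library does, is to compare
codeset names case-insensitively while ignoring everything that is not a
letter or digit. The function name codeset_names_match is made up for this
example.

```c
#include <ctype.h>

/* Advance past characters that are neither letters nor digits. */
static const char *skip_punct(const char *s)
{
    while (*s != '\0' && !isalnum((unsigned char)*s))
        s++;
    return s;
}

/*
 * Return 1 if the two codeset names are equal when case and punctuation
 * are ignored (so "eucJP" matches "EUC-JP"), 0 otherwise.
 */
int codeset_names_match(const char *a, const char *b)
{
    for (;;) {
        a = skip_punct(a);
        b = skip_punct(b);
        if (*a == '\0' || *b == '\0')
            return *a == '\0' && *b == '\0';
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
}
```

This only handles spelling variants of the same name; real aliases such as
ANSI_X3.4-1968 for ASCII would still need a table.
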

Thomas



More information about the Cygwin-developers mailing list