"C" UTF-8 trouble

Andy Koppe andy.koppe@gmail.com
Thu Oct 8 05:12:00 GMT 2009


2009/10/7 Andy Koppe:
> 2009/10/7 Corinna Vinschen:
>> At least, from the above it looks like all uppercase.  The KOI8s would
>> be covered by a translation table.
>>
>> The problem is, we *must* draw a line somewhere.
>
> I agree, better to just stick with __locale_charset(), unless problems
> do arise. FWIW, vim works fine with *.KOI8 locales.

Actually, that's not quite right: on seeing "CP20866", vim falls back
to iso-8859-1. This works on the surface, since it's just another
8-bit charset, but things like case conversion or word-boundary
detection might be incorrect.

Anyway, here's a fix that doesn't involve a translation table:

* libc/locale/nl_langinfo.c (nl_langinfo): Fall back to
__locale_charset only if the current locale does not specify a
charset.

--- newlib/libc/locale/nl_langinfo.c    7 Oct 2009 16:45:23 -0000       1.3
+++ newlib/libc/locale/nl_langinfo.c    8 Oct 2009 05:00:23 -0000
@@ -59,7 +59,11 @@ _DEFUN(nl_langinfo, (item),
    switch (item) {
        case CODESET:
 #ifdef __CYGWIN__
-               ret = __locale_charset ();
+               s = setlocale(LC_CTYPE, NULL);
+               if (s != NULL && (cs = strchr(s, '.')) != NULL)
+                       ret = cs + 1;
+               else
+                       ret = __locale_charset();
 #else
                ret = "";
                if ((s = setlocale(LC_CTYPE, NULL)) != NULL) {
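
For illustration, the logic of the Cygwin branch can be sketched as a
standalone function. This is only a sketch: charset_from_locale() and
its fallback parameter are hypothetical names that do not exist in
newlib, and the fallback argument merely stands in for whatever
__locale_charset() would return.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of newlib: return the charset part of a
 * locale name such as "ru_RU.KOI8-R". If the name is NULL or contains
 * no '.', return `fallback` instead (standing in for the value that
 * __locale_charset() would supply in the patch).
 */
const char *
charset_from_locale(const char *locale, const char *fallback)
{
    const char *dot;

    if (locale != NULL && (dot = strchr(locale, '.')) != NULL)
        return dot + 1;   /* everything after the first dot, unmodified */
    return fallback;
}
```

Called as charset_from_locale(setlocale(LC_CTYPE, NULL), fallback), this
would yield "KOI8-R" for "ru_RU.KOI8-R" and the fallback for a bare "C".
Note that the sketch returns everything after the dot as-is, so any
"@modifier" suffix in the locale name is kept in the result.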



More information about the Cygwin-developers mailing list