UTF8 support in Cygwin

Chris January chris@atomice.net
Wed Jul 3 03:07:00 GMT 2002


I am working on a patch that would add UTF8 support to Cygwin. That is,
Unicode filenames would be encoded as UTF8 before being returned by,
e.g., readdir, and then converted back to Unicode before being passed to
the Windows API.
This would solve Ville Herva's problem where he/she wanted to back up a
filesystem containing Unicode filenames using Cygwin, but found that the
Unicode characters were converted to question marks. Also, with an
appropriate terminal, it is actually possible to view the Unicode characters
(although at the moment it is not possible to input them correctly, AFAIK).
The code is currently guarded by a CYGWIN environment variable flag, 'utf8'.

An example of the way I'm doing this is:
  if (use_utf8)
    {
      WCHAR wbuf[MAX_PATH];
      if (MultiByteToWideChar (CP_UTF8, 0, get_win32_name (), -1,
                               wbuf, MAX_PATH) == 0)
        {
          __seterrno ();
          goto done;
        }
      x = CreateFileW (wbuf, access, shared, &sa, creation_distribution,
                       file_attributes, 0);
    }
  else
    x = CreateFileA (get_win32_name (), access, shared, &sa,
                     creation_distribution, file_attributes, 0);
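
To make the intended round trip concrete, here is a minimal, portable
sketch of the two conversions in plain C, without the Windows API. It is
not the patch itself: the names utf16_to_utf8, utf8_to_utf16 and
self_check are made up for illustration, and the decoder is deliberately
simple (it does not reject overlong sequences). In the patch, the
UTF8-to-Unicode direction is handled by MultiByteToWideChar as above;
the readdir direction would presumably use WideCharToMultiByte with
CP_UTF8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode a NUL-terminated UTF-16 string as UTF-8.
   Returns the number of bytes written (excluding the NUL), or
   (size_t) -1 on an unpaired surrogate or if dst is too small.  */
static size_t
utf16_to_utf8 (const uint16_t *src, char *dst, size_t dstlen)
{
  static const unsigned char lead[] = { 0, 0, 0xC0, 0xE0, 0xF0 };
  size_t n = 0;
  if (dstlen == 0)
    return (size_t) -1;
  while (*src)
    {
      uint32_t cp = *src++;
      if (cp >= 0xD800 && cp <= 0xDBFF)        /* high surrogate */
        {
          if (*src < 0xDC00 || *src > 0xDFFF)
            return (size_t) -1;                /* no low surrogate follows */
          cp = 0x10000 + ((cp - 0xD800) << 10) + (*src++ - 0xDC00);
        }
      else if (cp >= 0xDC00 && cp <= 0xDFFF)
        return (size_t) -1;                    /* lone low surrogate */
      size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
      if (n + len >= dstlen)                   /* keep room for the NUL */
        return (size_t) -1;
      if (len == 1)
        dst[n++] = (char) cp;
      else
        {
          dst[n++] = (char) (lead[len] | (cp >> (6 * (len - 1))));
          for (size_t i = len - 1; i-- > 0;)   /* continuation bytes */
            dst[n++] = (char) (0x80 | ((cp >> (6 * i)) & 0x3F));
        }
    }
  dst[n] = '\0';
  return n;
}

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16.
   Returns the number of 16-bit units written (excluding the NUL), or
   (size_t) -1 on malformed input or if dst is too small.  */
static size_t
utf8_to_utf16 (const char *src, uint16_t *dst, size_t dstlen)
{
  const unsigned char *s = (const unsigned char *) src;
  size_t n = 0;
  if (dstlen == 0)
    return (size_t) -1;
  while (*s)
    {
      uint32_t cp;
      int extra;
      if (*s < 0x80)                { cp = *s;        extra = 0; }
      else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
      else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
      else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
      else
        return (size_t) -1;                    /* invalid lead byte */
      s++;
      while (extra-- > 0)
        {
          if ((*s & 0xC0) != 0x80)
            return (size_t) -1;                /* truncated sequence */
          cp = (cp << 6) | (*s++ & 0x3F);
        }
      size_t units = cp >= 0x10000 ? 2 : 1;
      if (cp > 0x10FFFF || n + units >= dstlen)
        return (size_t) -1;
      if (units == 1)
        dst[n++] = (uint16_t) cp;
      else
        {
          cp -= 0x10000;                       /* split into a surrogate pair */
          dst[n++] = (uint16_t) (0xD800 | (cp >> 10));
          dst[n++] = (uint16_t) (0xDC00 | (cp & 0x3FF));
        }
    }
  dst[n] = 0;
  return n;
}

/* Round trip: Unicode name -> UTF-8 (as readdir would return it)
   -> Unicode again (as it would be passed to a W function).  */
static void
self_check (void)
{
  /* "f", U+00E9, U+20AC, U+1F600 (as a surrogate pair), ".txt" */
  const uint16_t name[] = { 'f', 0x00E9, 0x20AC, 0xD83D, 0xDE00,
                            '.', 't', 'x', 't', 0 };
  char utf8[64];
  uint16_t back[32];

  size_t n8 = utf16_to_utf8 (name, utf8, sizeof utf8);
  assert (n8 == 14);                           /* 1 + 2 + 3 + 4 + 4 bytes */
  size_t n16 = utf8_to_utf16 (utf8, back, sizeof back / sizeof back[0]);
  assert (n16 == 9);
  assert (memcmp (name, back, sizeof name) == 0);
}
```

Calling self_check () from a test program exercises the round trip. The
point of the sketch is only that the conversion is lossless for valid
names, which is what lets the question marks go away.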

My question is, does anyone have any objections to doing things this way,
and if so, can they suggest a better way? I don't want to patch the whole of
Cygwin and then have to re-write everything at a later date.

Regards
Chris



