This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Q: Is anybody here using the CYGWIN=codepage:oem setting?
On Mar 19 21:11, Corinna Vinschen wrote:
> On Mar 19 19:41, Eric Blake wrote:
> > Corinna Vinschen <corinna-cygwin <at> cygwin.com> writes:
> > > ...unless Cygwin itself would call setlocale().
> >
> > I'm not a fan of that. POSIX is explicit that an application that
> > intentionally avoids calling setlocale() shall behave as though it had called
> > setlocale(LC_ALL,"C").
> > [...]
> But I admit that I'm not very happy with this idea either. Still, we
> have to convert from MB to WC and vice-versa independently of the
> application, while other systems based on byte charsets simply don't
> have this problem.
Here's another idea:
If the codeset is not UTF-8, and if a filename contains wide chars not
representable in the current ANSI codeset, use the good old ASCII "SO/SI"
method.
Example: Assume the ANSI codepage is CP1252 and the filename, in
UTF-16, is
/dir/to/foo\x1234bar
All chars except for \x1234 are convertible to the current ANSI code
page. The convertible chars are converted as usual. The
non-convertible characters are converted to an ASCII SO/SI sequence:
/dir/to/foo\x0e\x12\x34\x0fbar
On the way back, Cygwin converts SO/SI sequences back to their
UTF-16 counterparts and converts everything else using the current
codepage-to-UTF-16 conversion.
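The round trip described above can be sketched roughly as follows. This is
only an illustration in Python, not Cygwin code: the function names
utf16_to_ansi and ansi_to_utf16 are made up for the example, it assumes a
single-byte codepage such as CP1252, and it does not deal with a filename
that already contains a literal 0x0e or 0x0f byte.

```python
SO, SI = 0x0E, 0x0F  # ASCII "shift out" / "shift in" control bytes


def utf16_to_ansi(name: str, codepage: str = "cp1252") -> bytes:
    """Hypothetical helper: convert a filename to the ANSI codepage,
    wrapping each non-convertible UTF-16 code unit in SO ... SI."""
    out = bytearray()
    raw = name.encode("utf-16-be")  # two bytes per UTF-16 code unit
    for i in range(0, len(raw), 2):
        hi, lo = raw[i], raw[i + 1]
        try:
            # Convertible characters are converted as usual.
            out += chr((hi << 8) | lo).encode(codepage)
        except UnicodeEncodeError:
            # Non-convertible code unit: SO, high byte, low byte, SI.
            out += bytes((SO, hi, lo, SI))
    return bytes(out)


def ansi_to_utf16(data: bytes, codepage: str = "cp1252") -> str:
    """Hypothetical helper: reverse of utf16_to_ansi.
    Assumes a single-byte codepage, so bytes are decoded one at a time."""
    units = bytearray()
    i = 0
    while i < len(data):
        if data[i] == SO and i + 3 < len(data) and data[i + 3] == SI:
            # SO/SI sequence: the two middle bytes are the raw code unit.
            units += data[i + 1:i + 3]
            i += 4
        else:
            # Everything else goes through the normal codepage conversion.
            units += data[i:i + 1].decode(codepage).encode("utf-16-be")
            i += 1
    return units.decode("utf-16-be")


encoded = utf16_to_ansi("/dir/to/foo\u1234bar")
decoded = ansi_to_utf16(encoded)
```

With the filename from the example, encoded holds the byte string
/dir/to/foo\x0e\x12\x34\x0fbar and decoded is the original name again.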
This would make it possible to manipulate all files on the disk, even
those whose names contain characters invalid in the current CP.
Does that solution make sense?
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat