Console codepage setting via chcp?

Andy Koppe
Fri Sep 25 18:43:00 GMT 2009

2009/9/25 Corinna Vinschen:
>> - System objects will always be translated using UTF-8. This includes
>> file names, user names, and initial environment variables (and
>> probably more I'm not aware of).
> More than 10 minutes later I'm still thinking that this is the best
> solution in the long run.  There will be no situation in which any
> process running on the system has a different idea of a system object
> than any other process.  That could also help to avoid interoperability
> issues in client/server applications.

Yes, there's a lot to be said for keeping such complications to a
minimum. Here are some further deliberations on the topic:

The downside, of course, is that non-ASCII filenames created in a
non-UTF-8 locale won't show up correctly in Windows, and vice versa.
But that's the same on Linux if the global setting is UTF-8 while the
terminal is set to something else. And the stock answer to any
complaints will be: Use UTF-8!
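
To illustrate the kind of mismatch meant here, a minimal Python sketch
follows. It is a generic demonstration of bytes written under one charset
and read under another, not Cygwin's actual conversion code; the filename
"café" and the ISO-8859-1 charset are picked purely as an example.

```python
# Generic charset-mismatch demo; not how cygwin1.dll converts names.
name = "caf\u00e9"  # "café", a hypothetical non-ASCII filename

# Written as UTF-8 but displayed by something assuming ISO-8859-1:
utf8_bytes = name.encode("utf-8")
shown_as_latin1 = utf8_bytes.decode("latin-1")

# Written as ISO-8859-1 but interpreted as UTF-8 (the bytes are invalid
# UTF-8, so the decoder substitutes U+FFFD for the bad sequence):
latin1_bytes = name.encode("latin-1")
shown_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(repr(shown_as_latin1))
print(repr(shown_as_utf8))
```

Either way round, the name only survives intact if both sides agree on
the charset, which is the argument for settling on UTF-8 everywhere.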

In any case, the DCxx scheme will ensure that things work correctly
within any particular locale.

And I guess the ^N scheme can go (or be disabled)?

>> - The "C" locale's charset will be UTF-8.
>> - There'll be language-neutral "C.<charset>" locales.
>> - The user's ANSI codepage will remain the default charset for
>> "language_TERRITORY" locales.

Thanks, this gives me something to work with for mintty. Luckily, due
to the everything-is-UTF-8 approach, no mingw wrapper is actually
needed after all, as it wouldn't make a difference to anything anyway.

>> - The console charset will be set according to LC_ALL/LC_CTYPE/LANG
>> when cygwin1.dll is initialised. (Or will 'setcons' be needed for
>> that?)
> Hmm.  Unsure.  I know that Thomas dislikes the idea and you are not
> overly convinced either.  One of Thomas's arguments is the non-standard
> tool necessary to switch the terminal charset.  I think that's not a
> valid argument.  There is no standard how to switch the charset used by
> a terminal.

As far as I know, xterm, rxvt, gnome-terminal and konsole all respect
the locale variables unless a program-specific option is used.

> So, utilizing the initial setting of LC_ALL/ff. is as good
> as defaulting to UTF-8 and allowing to switch via a setcons tool.

'setcons' requires a wrapper script, whereas the variables don't
necessarily, as they can be set in the Windows environment. This would
allow programs to be invoked directly from a shortcut and still
pick up the user's setting.

Also, one of the locale variables needs to be set anyway if one wants
to use something other than the default locale.
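
For reference, here is a rough Python sketch of how a charset could be
derived from those variables under the rules proposed above. The
precedence (LC_ALL, then LC_CTYPE, then LANG, with empty values treated
as unset) is the usual POSIX one. The function name resolve_charset and
the "CP1252" stand-in for the user's ANSI codepage are assumptions made
up for illustration; nothing here is taken from cygwin1.dll.

```python
def resolve_charset(env, ansi_charset="CP1252"):
    """Pick a charset from locale variables (illustrative sketch only).

    env is a dict of environment variables. ansi_charset is a
    hypothetical placeholder for the user's ANSI codepage.
    """
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    locale_name = ""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            locale_name = value
            break
    if not locale_name:
        locale_name = "C"  # nothing set: fall back to the "C" locale

    # Drop an "@modifier" suffix, then look for an explicit ".charset".
    base = locale_name.split("@", 1)[0]
    if "." in base:
        return base.split(".", 1)[1]

    # Proposed defaults: UTF-8 for "C", the ANSI codepage otherwise.
    if base in ("C", "POSIX"):
        return "UTF-8"
    return ansi_charset


print(resolve_charset({"LANG": "en_GB.UTF-8", "LC_CTYPE": "de_DE.ISO-8859-1"}))
```

Since these are ordinary environment variables, the same lookup works
whether they were set by a shell profile or in the Windows environment.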

> I have
> found an easy way to allow a setcons tool which only switches the charset
> used by Cygwin.  It doesn't affect the setting in cmd, or made by chcp.

That's a good idea. I've come round to thinking that 'setcons' is
worth having in addition to the initial setting from the environment.

>> - setlocale() will have no effects beyond what's expected in Linux.
> Well... probably.  I'm not saying yes without asking a lawyer first.

:)  I put that a bit too probingly, didn't I?

