Console codepage setting via chcp?
Wed Sep 23 16:49:00 GMT 2009
While mulling over the locale stuff it suddenly occurred to me that
I made an assumption about the Windows console which was never true
and quite dumb as well.
Right now, if you switch the charset via the setlocale function, you
also switch the charset used for console output. This is done on the
grounds that the console isn't capable of switching its charset by
itself, the way terminal emulators like mintty are. The problem with
this approach is even documented in setup2.sgml, just commented out.
If you use a tool like ssh to connect to a remote machine, then ssh
potentially uses a different locale and charset than the remote shell.
ssh is always running in the "C" locale, while the remote shell could
easily run in "en_US.UTF-8". But since you can't control the console
codepage separately from the application locale...
After so many months of looking into the charset stuff it occurred to me
just a few minutes ago that there was *always* a way to switch the
codepage of the console in a fixed manner: chcp. If Cygwin uses the
codepage returned by GetConsoleOutputCP(), then it uses what the user
chose by running chcp, or the default OEM codepage. The alternate
charset, typically only used for the graphical characters anyway,
could be either CP 437, or what GetOEMCP() returns.
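
To make the idea a bit more concrete, here is a minimal sketch of how the
selection could look. This is not actual Cygwin code. The helper names
pick_console_codepage, pick_alt_codepage and current_console_codepage are
hypothetical, and falling back to GetOEMCP() when GetConsoleOutputCP()
returns 0 (no console attached) is my assumption. Only GetConsoleOutputCP()
and GetOEMCP() are real Win32 calls.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: use the console's own output codepage (what the
   user set via chcp).  Assumption: a value of 0 means the query failed,
   in which case we fall back to the OEM default. */
unsigned pick_console_codepage(unsigned console_cp, unsigned oem_cp)
{
    return console_cp != 0 ? console_cp : oem_cp;
}

/* Hypothetical helper: the alternate charset for graphical characters
   is either fixed CP 437 or the OEM codepage. */
unsigned pick_alt_codepage(int use_oem, unsigned oem_cp)
{
    return use_oem ? oem_cp : 437u;
}

#ifdef _WIN32
/* Windows-only wrapper that queries the real values. */
unsigned current_console_codepage(void)
{
    return pick_console_codepage(GetConsoleOutputCP(), GetOEMCP());
}
#endif
```

The decision logic is kept apart from the Win32 calls so it can be checked
without a console at hand.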
This way the charset used to print characters in the Windows console
is a nicely encapsulated user setting, just like with mintty, xterm,
and other terminal emulators.
I tested this on XP and W7 and it works fine. The documentation
would just have to be extended to explain to the user how to switch
the console output codepage using the native chcp tool.
My question is, what do you all think? Isn't that a much more
controllable setting than how it's done now?
The only downside from my point of view is that the user has to know the
codepage numbers. But these are already documented in the Cygwin docs
so that shouldn't be much of a problem. Maybe we could also create our
own tool to switch the codepage, which takes a charset name and translates
it to the corresponding codepage.
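
A rough sketch of what the lookup part of such a tool might look like, just
to illustrate the idea. The function name charset_to_codepage and the table
are hypothetical, the table is deliberately tiny, and returning 0 for an
unknown name is my assumption. The numbers themselves are the standard
Windows codepage identifiers for these charsets.

```c
#include <ctype.h>
#include <stddef.h>

struct charset_entry {
    const char *name;
    unsigned codepage;
};

/* Illustrative subset only; a real tool would carry the full list
   from the Cygwin docs. */
static const struct charset_entry charset_table[] = {
    { "CP437",       437 },
    { "CP850",       850 },
    { "CP1252",      1252 },
    { "ISO-8859-1",  28591 },
    { "ISO-8859-15", 28605 },
    { "UTF-8",       65001 },
};

/* Case-insensitive comparison, written out to avoid depending on
   strcasecmp or _stricmp. */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns the codepage for a charset name, or 0 if the name is
   NULL or not in the table. */
unsigned charset_to_codepage(const char *name)
{
    size_t i;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof charset_table / sizeof charset_table[0]; i++)
        if (names_equal(name, charset_table[i].name))
            return charset_table[i].codepage;
    return 0;
}
```

The resulting number is what the user would otherwise have to type after
chcp, so a tool built on this could pass it on to chcp or, presumably, call
SetConsoleOutputCP() directly.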
What do you think, is that a good solution, is it better or worse than
what we have now, or do you see serious problems with this?
Corinna Vinschen
Cygwin Project Co-Leader

Please, send mails regarding Cygwin to cygwin AT cygwin DOT com