[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Takashi Yano takashi.yano@nifty.ne.jp
Wed Sep 2 09:41:04 GMT 2020


On Wed, 2 Sep 2020 10:30:14 +0200
Corinna Vinschen wrote:
> On Sep  1 18:19, Johannes Schindelin wrote:
> > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > correct, but after that (at least if Pseudo Console support is enabled),
> > we try to find the default code page for that `LCID`, which is ASCII
> > (437). Subsequently, we set the Console output code page to that value,
> > completely ignoring that we wanted to use UTF-8.
> > 
> > Let's not ignore the specifically asked-for UTF-8 character set.
> > 
> > While at it, let's also set the Console output code page even if Pseudo
> > Console support is disabled; contrary to the behavior of v3.0.7, the
> > Console output code page is not ignored in that case.
> > 
> > The most common symptom would be that console applications which do not
> > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > 
> > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > https://github.com/msys2/MSYS2-packages/issues/2012,
> > https://github.com/rust-lang/cargo/issues/8369,
> > https://github.com/git-for-windows/git/issues/2734,
> > https://github.com/git-for-windows/git/issues/2793,
> > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > few others.
> > 
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  winsup/cygwin/fhandler_tty.cc | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> 
> Ok guys, I'm not opposed to this change in terms of its result,

I am sorry, but I cannot agree with Johannes's patch.

For example, code page in Japan is CP932 by default.
In this case, cmd.exe, netsh.exe and so on are generate
messages in Japanese.

If the code page is set to CP_UTF8, messages from such
commands changes to english. I guess similar things happen
for other locales.

I do not prefer this result.

Furthermore, I looked into the issue:
https://github.com/git-for-windows/git/issues/2734
and I found that git-for-windows always use utf-8
encoding even if the locale is ja_JP.CP932.
It does not change coding based on locale or code
page.

Even with Johannes's patch, if mintty is started with
locale ja_JP.CP932, the file name will be garbled
bacause SetConsoleOutputCP(CP_UTF8) will not be called.

IMHO, it is the problem of git-for-windows rather
than cygwin and msys2.

To make current version of git-for-windows work, it is
necessary to set code page to CP_UTF8 regardless locale.
This does not make sense at all.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


More information about the Cygwin-patches mailing list