[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Johannes Schindelin Johannes.Schindelin@gmx.de
Fri Sep 4 06:23:42 GMT 2020


Hi Takashi,

On Fri, 4 Sep 2020, Takashi Yano via Cygwin-patches wrote:

> Hi Johannes and Corinna,
>
> On Tue, 1 Sep 2020 18:19:16 +0200 (CEST)
> Johannes Schindelin wrote:
>
> > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > correct, but after that (at least if Pseudo Console support is enabled),
> > we try to find the default code page for that `LCID`, which is ASCII
> > (437). Subsequently, we set the Console output code page to that value,
> > completely ignoring that we wanted to use UTF-8.
> >
> > Let's not ignore the specifically asked-for UTF-8 character set.
> >
> > While at it, let's also set the Console output code page even if Pseudo
> > Console support is disabled; contrary to the behavior of v3.0.7, the
> > Console output code page is not ignored in that case.
> >
> > The most common symptom would be that console applications which do not
> > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> >
> > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > https://github.com/msys2/MSYS2-packages/issues/2012,
> > https://github.com/rust-lang/cargo/issues/8369,
> > https://github.com/git-for-windows/git/issues/2734,
> > https://github.com/git-for-windows/git/issues/2793,
> > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > few others.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  winsup/cygwin/fhandler_tty.cc | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
> > index 06789a500..414c26992 100644
> > --- a/winsup/cygwin/fhandler_tty.cc
> > +++ b/winsup/cygwin/fhandler_tty.cc
> > @@ -2859,6 +2859,15 @@ fhandler_pty_slave::setup_locale (void)
> >    char charset[ENCODING_LEN + 1] = "ASCII";
> >    LCID lcid = get_langinfo (locale, charset);
> >
> > +  /* Special-case the UTF-8 character set */
> > +  if (strcasecmp (charset, "UTF-8") == 0)
> > +    {
> > +      get_ttyp ()->term_code_page = CP_UTF8;
> > +      SetConsoleCP (CP_UTF8);
> > +      SetConsoleOutputCP (CP_UTF8);
> > +      return;
> > +    }
> > +
> >    /* Set console code page from locale */
> >    if (get_pseudo_console ())
> >      {
> > --
> > 2.27.0
>
> I would like to propose a counter patch attached.
> What do you think of this patch?
>
> This patch does not treat UTF-8 as special.

Sure, but it only fixes the issue in `disable_pcon` mode in the current
tip commit. That's because a new Pseudo Console is created for every
spawned non-Cygwin console application, and that new Pseudo Console does
_not_ have the code page set by your patch.

I verified that this patch works around the issue in `disable_pcon` mode,
which is good.

Ciao,
Dscho


More information about the Cygwin-patches mailing list