[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Johannes Schindelin Johannes.Schindelin@gmx.de
Wed Sep 2 06:06:43 GMT 2020


Hi,

On Tue, 1 Sep 2020, Johannes Schindelin wrote:

> When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> correct, but after that (at least if Pseudo Console support is enabled),
> we try to find the default code page for that `LCID`, which is the OEM
> code page 437. Subsequently, we set the Console output code page to that
> value, completely ignoring that we wanted to use UTF-8.
>
> Let's not ignore the specifically asked-for UTF-8 character set.
>
> While at it, let's also set the Console output code page even if Pseudo
> Console support is disabled; contrary to the behavior of v3.0.7, the
> Console output code page is not ignored in that case.
>
> The most common symptom would be that console applications which do not
> specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
>
> This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> https://github.com/msys2/MSYS2-packages/issues/2012,
> https://github.com/rust-lang/cargo/issues/8369,
> https://github.com/git-for-windows/git/issues/2734,
> https://github.com/git-for-windows/git/issues/2793,
> https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> few others.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  winsup/cygwin/fhandler_tty.cc | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
> index 06789a500..414c26992 100644
> --- a/winsup/cygwin/fhandler_tty.cc
> +++ b/winsup/cygwin/fhandler_tty.cc
> @@ -2859,6 +2859,15 @@ fhandler_pty_slave::setup_locale (void)
>    char charset[ENCODING_LEN + 1] = "ASCII";
>    LCID lcid = get_langinfo (locale, charset);
>
> +  /* Special-case the UTF-8 character set */
> +  if (strcasecmp (charset, "UTF-8") == 0)
> +    {
> +      get_ttyp ()->term_code_page = CP_UTF8;
> +      SetConsoleCP (CP_UTF8);
> +      SetConsoleOutputCP (CP_UTF8);
> +      return;
> +    }
> +

Just a word of warning: this patch can be ported to a634adda5
(libm/machine/arm: Rename s*_fma.c -> s*_fma_arm.c, 2020-09-01) relatively
easily, but there it only works as intended in the `disable_pcon` mode.
The first two patches of this patch series cannot be ported that easily,
as they are no longer applicable after the complete redesign of the Pseudo
Console support.
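
For readers who want to try the special case from the quoted hunk outside
of the Cygwin tree, here is a minimal sketch of just the decision it
makes. The helper name `code_page_for_charset`, its `lcid_default`
parameter and the `SKETCH_CP_*` constants are hypothetical and exist only
for illustration; the real code stores the result in `term_code_page` and
calls `SetConsoleCP ()`/`SetConsoleOutputCP ()`, which this sketch
deliberately leaves out so that it builds on any POSIX system.

```c
#include <strings.h>  /* strcasecmp (POSIX) */

/* Code page numbers spelled out so that <windows.h> is not needed.
   65001 is the value of CP_UTF8; 437 is the OEM United States code page. */
#define SKETCH_CP_UTF8   65001u
#define SKETCH_CP_OEM_US 437u

/* Hypothetical helper, not part of Cygwin: return the code page to use
   for the given locale charset.  LCID_DEFAULT stands in for whatever the
   LCID-based lookup would have produced.  */
unsigned int
code_page_for_charset (const char *charset, unsigned int lcid_default)
{
  /* Same case-insensitive comparison as in the patch.  */
  if (strcasecmp (charset, "UTF-8") == 0)
    return SKETCH_CP_UTF8;
  /* Any other charset keeps the LCID-derived default.  */
  return lcid_default;
}
```

With `LANG=en_US.UTF-8`, the charset is "UTF-8", so calling
`code_page_for_charset ("UTF-8", SKETCH_CP_OEM_US)` yields 65001 instead
of falling back to 437.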

The new design calls for a separate Pseudo Console to be created for each
spawned console application, and I have not found any way to convince my
local version of the runtime to change the code page of these Pseudo
Consoles away from the rather unfortunate default of 437.

This is a problem.

Take for example https://github.com/git-for-windows/git/issues/2793.
Telling the users that they should patch node.js and recompile is probably
not going to fly.

Hopefully there is a way to fix this; otherwise, Pseudo Console support
will continue to be quite the support burden.

Ciao,
Johannes

>    /* Set console code page from locale */
>    if (get_pseudo_console ())
>      {
> --
> 2.27.0
>
>

