[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Johannes Schindelin johannes.schindelin@gmx.de
Tue Sep 1 16:19:16 GMT 2020

When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
correct, but after that (at least if Pseudo Console support is enabled),
we try to find the default code page for that `LCID`, which is the OEM
code page 437. Subsequently, we set the Console output code page to that
value, completely ignoring that we wanted to use UTF-8.

Let's not ignore the specifically asked-for UTF-8 character set.
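
The intended precedence can be sketched in isolation as follows. This is
only an illustration, not the code in `fhandler_tty.cc`: the helper name
`choose_console_code_page` and the constant `kCpUtf8` are hypothetical,
and the Windows constant `CP_UTF8` (65001) is reproduced by hand so that
the sketch compiles without the Windows headers.

```cpp
#include <strings.h>  // strcasecmp (POSIX; also available on Cygwin)

// Hypothetical stand-in for the Windows CP_UTF8 constant (65001).
constexpr unsigned int kCpUtf8 = 65001;

// Hypothetical helper, not part of fhandler_tty.cc.
// locale_default_cp is whatever the LCID lookup produced,
// e.g. 437 for en_US.
inline unsigned int
choose_console_code_page (const char *charset,
                          unsigned int locale_default_cp)
{
  // An explicitly requested UTF-8 charset overrides the LCID default.
  if (strcasecmp (charset, "UTF-8") == 0)
    return kCpUtf8;
  return locale_default_cp;
}
```

With `charset` set to "UTF-8" and a locale default of 437, this sketch
picks 65001; any other charset falls through to the locale default.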

While at it, let's also set the Console output code page even if Pseudo
Console support is disabled; contrary to the behavior of v3.0.7, the
Console output code page is not ignored in that case.

The most common symptom would be that console applications which do not
specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
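
A minimal sketch of the kind of application affected, assuming a program
that simply writes UTF-8 bytes and relies on the inherited Console output
code page. The helper names `utf8_sample` and `write_utf8_sample` are
hypothetical and only serve the illustration.

```cpp
#include <ostream>
#include <string>

// Hypothetical helper: "é" followed by a newline, encoded as UTF-8
// (bytes 0xC3 0xA9 0x0A).
inline std::string
utf8_sample ()
{
  return std::string ("\xC3\xA9\n");
}

// Writes the raw bytes unchanged. Deliberately makes no call to
// SetConsoleOutputCP(); how the bytes are rendered depends entirely
// on the code page the console was left in.
inline void
write_utf8_sample (std::ostream &out)
{
  out << utf8_sample ();
}
```

Calling `write_utf8_sample (std::cout)` from such a program shows the
intended character only if the Console output code page is already UTF-8.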

This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
https://github.com/git-for-windows/git/issues/2792, and possibly quite a
few others.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 winsup/cygwin/fhandler_tty.cc | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
index 06789a500..414c26992 100644
--- a/winsup/cygwin/fhandler_tty.cc
+++ b/winsup/cygwin/fhandler_tty.cc
@@ -2859,6 +2859,15 @@ fhandler_pty_slave::setup_locale (void)
   char charset[ENCODING_LEN + 1] = "ASCII";
   LCID lcid = get_langinfo (locale, charset);

+  /* Special-case the UTF-8 character set */
+  if (strcasecmp (charset, "UTF-8") == 0)
+    {
+      get_ttyp ()->term_code_page = CP_UTF8;
+      SetConsoleCP (CP_UTF8);
+      SetConsoleOutputCP (CP_UTF8);
+      return;
+    }
   /* Set console code page from locale */
   if (get_pseudo_console ())