[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Corinna Vinschen corinna-cygwin@cygwin.com
Wed Sep 2 08:38:18 GMT 2020


On Sep  2 10:30, Corinna Vinschen wrote:
> On Sep  1 18:19, Johannes Schindelin wrote:
> > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > correct, but after that (at least if Pseudo Console support is enabled),
> > we try to find the default code page for that `LCID`, which is ASCII
> > (437). Subsequently, we set the Console output code page to that value,
> > completely ignoring that we wanted to use UTF-8.
> > 
> > Let's not ignore the specifically asked-for UTF-8 character set.
> > 
> > While at it, let's also set the Console output code page even if Pseudo
> > Console support is disabled; contrary to the behavior of v3.0.7, the
> > Console output code page is not ignored in that case.
> > 
> > The most common symptom would be that console applications which do not
> > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > 
> > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > https://github.com/msys2/MSYS2-packages/issues/2012,
> > https://github.com/rust-lang/cargo/issues/8369,
> > https://github.com/git-for-windows/git/issues/2734,
> > https://github.com/git-for-windows/git/issues/2793,
> > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > few others.
> > 
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  winsup/cygwin/fhandler_tty.cc | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> 
> Ok guys, I'm not opposed to this change in terms of its result,
> but I'm starting to wonder why all this locale code in fhandler_tty
> is necessary at all.
> 
> I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> on the charsets which looks like duplicates of the initial_setlocale()
> call performed at DLL startup.
> 
> If there's anything missing in the initial_setlocale() call which would
> be required by the pseudo tty code?  What exactly is it?  The codepage?
> And why can't we just add the info to cygheap->locale at initial_setlocale()
> time so it's available at exec time without going through all this hassle
> every time?
> 
> Apart from that, all this locale/charset/lcid stuff should be concentrated
> in nlsfunc.cc ideally.

get_locale_from_env() and get_langinfo() should go away.  If we just
need a codepage for get_ttyp ()->term_code_page, we should really find a
way to do this from within internal_setlocale().


Corinna


More information about the Cygwin-patches mailing list