[PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

Takashi Yano takashi.yano@nifty.ne.jp
Fri Sep 4 09:21:49 GMT 2020


Hi Corinna,

On Thu, 3 Sep 2020 19:59:12 +0200
Corinna Vinschen wrote:
> On Sep  2 18:38, Corinna Vinschen wrote:
> > Hi Takashi,
> > 
> > On Sep  3 01:25, Takashi Yano via Cygwin-patches wrote:
> > > Hi Corinna,
> > > 
> > > On Wed, 2 Sep 2020 17:24:50 +0200
> > > Corinna Vinschen wrote:
> > > > > > get_locale_from_env() and get_langinfo() should go away.  If we just
> > > > > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > > > > way to do this from within internal_setlocale().
> > > > > 
> > > > > I looked into the internal_setlocale() code, but I could not find
> > > > > the code which handles the code page. I found the code handling
> > > > > the code page in the __set_charset_from_locale() function in nlsfuncs.cc,
> > > > > but it does not return the code page itself. Could you please explain
> > > > > your idea in more detail?
> > > > 
> > > > I had none yet :)  I was just musing, without actually thinking about a
> > > > solution.  But I think this isn't very complicated.  Given this is
> > > > inside Cygwin, nothing keeps the function from having a well-defined
> > > > side-effect, as in setting a (not yet existing) member "term_code_page"
> > > > of cygheap->locale.
> > > > 
> > > > Kind of like this:
> > > > [...]
> > > I have tried your code, however, it does not work as expected.
> > > It seems that __set_charset_from_locale() is not called.
> > > cygheap->locale.term_code_page is always 0.
> > 
> > Ah, right!  Take a look into newlib/libc/locale/locale.c, function
> > __loadlocale().  This function is called from _setlocale_r().  However,
> > it calls __set_charset_from_locale() *only* if the charset isn't already
> > given explicitly in the LC_* or LANG string, because otherwise we
> > already know the charset, after all.
> > 
> > Darn!  That foils my plans for world domination...
> > 
> > > Let me think about it for a while.
> > 
> > Thanks, I'll do the same.  I'd really like to simplify this stuff
> > and doing the locale shuffle in two entirely different locations
> > at different times is prone to getting out of sync.
> 
> The only idea I had so far was changing the way __set_charset_from_locale
> works from within _setlocale_r:
> 
> We could add a Cygwin-specific function only fetching the codepage and
> call it unconditionally from _setlocale_r.  __set_charset_from_locale is
> called with a new parameter "codepage", so it doesn't have to fetch the
> CP by itself, but it's still only called from _setlocale_r if necessary.
> 
> Would that be sufficient?  The CP conversion from 20127/ASCII to 65001/UTF8
> could be done at the point the codepage is actually required.
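
The late 20127 to 65001 conversion mentioned above could be a very small
helper applied where the code page is consumed. A minimal sketch, assuming
the stored value is a plain Windows code page number; the helper name
effective_term_code_page() is hypothetical and is not taken from the patch:

```c
/* Hypothetical helper, for illustration only (not part of the patch).
   Maps the US-ASCII code page (20127) to UTF-8 (65001) and leaves every
   other code page untouched. */
static unsigned int effective_term_code_page(unsigned int cp)
{
    const unsigned int CP_US_ASCII = 20127; /* Windows code page for US-ASCII */
    const unsigned int CP_UTF_8 = 65001;    /* Windows code page for UTF-8 */

    return cp == CP_US_ASCII ? CP_UTF_8 : cp;
}
```

Doing the mapping at the point of use would keep the stored value identical
to what the locale code determined, so only the consumer needs to know about
the ASCII special case.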

I think I have found the answer to your request.
Patch attached. What do you think of this patch?

Calling initial_setlocale() is necessary because, if it is not
called, nl_langinfo() always returns "ANSI_X3.4-1968" regardless
of the locale setting.
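
The same effect can be observed from an ordinary program. The following is
only a user-space sketch, assuming that the standard setlocale(LC_ALL, "")
call plays a role comparable to initial_setlocale() inside Cygwin; the
function names copy_codeset() and codeset_before_and_after() are made up
for this illustration, and the exact codeset strings depend on the platform:

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Copy the codeset name of the current locale into buf.
   nl_langinfo() may return a pointer to static storage that later
   calls can overwrite, so the string is copied right away. */
static void copy_codeset(char *buf, size_t len)
{
    const char *cs = nl_langinfo(CODESET);
    snprintf(buf, len, "%s", cs != NULL ? cs : "");
}

/* Record the codeset before and after adopting the locale from the
   environment. Returns 1 if setlocale() accepted the environment
   settings, 0 if it rejected them (the locale is then left unchanged). */
static int codeset_before_and_after(char *before, size_t blen,
                                    char *after, size_t alen)
{
    int ok;

    copy_codeset(before, blen);          /* locale in effect so far */
    ok = setlocale(LC_ALL, "") != NULL;  /* stand-in for initial_setlocale() */
    copy_codeset(after, alen);           /* codeset of the adopted locale */
    return ok;
}
```

When this is called at program start, "before" holds the codeset of the
default "C" locale, and "after" holds whatever LC_ALL, LC_CTYPE or LANG
select, provided that the environment names a locale the system supports.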

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Cygwin-pty-Replace-pty-specific-locale-functions-wit.patch
Type: application/octet-stream
Size: 5000 bytes
Desc: not available
URL: <https://cygwin.com/pipermail/cygwin-patches/attachments/20200904/333972b3/attachment.obj>