[PATCH] Cygwin: pty: Add workaround for ISO-2022 and ISCII in convert_mb_str().

Takashi Yano takashi.yano@nifty.ne.jp
Fri Sep 11 18:37:58 GMT 2020


On Sat, 12 Sep 2020 02:38:43 +0900
Takashi Yano via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
> On Sat, 12 Sep 2020 01:05:04 +0900
> Takashi Yano via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
> > On Fri, 11 Sep 2020 16:06:01 +0200
> > Corinna Vinschen wrote:
> > > On Sep 11 21:35, Takashi Yano via Cygwin-patches wrote:
> > > > Hi Corinna,
> > > > 
> > > > On Fri, 11 Sep 2020 14:08:40 +0200
> > > > Corinna Vinschen wrote:
> > > > > On Sep 11 19:54, Takashi Yano via Cygwin-patches wrote:
> > > > > > - In convert_mb_str(), exclude ISO-2022 and ISCII from the processing
> > > > > > >   for the case where a multibyte char is split in the middle.
> > > > > >   The reason is as follows.
> > > > > >   * ISO-2022 is too complicated to handle correctly.
> > > > > >   * Not sure what to do with ISCII.
> > > > > > ---
> > > > > >  winsup/cygwin/fhandler_tty.cc | 9 +++++++--
> > > > > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
> > > > > > index 37d033bbe..ee5c6a90a 100644
> > > > > > --- a/winsup/cygwin/fhandler_tty.cc
> > > > > > +++ b/winsup/cygwin/fhandler_tty.cc
> > > > > > @@ -117,6 +117,9 @@ CreateProcessW_Hooked
> > > > > >    return CreateProcessW_Orig (n, c, pa, ta, inh, f, e, d, si, pi);
> > > > > >  }
> > > > > >  
> > > > > > +#define IS_ISO_2022(x) ( (x) >= 50220 && (x) <= 50229 )
> > > > > > +#define IS_ISCII(x) ( (x) >= 57002 && (x) <= 57011 )
> > > > > > +
> > > > > >  static void
> > > > > >  convert_mb_str (UINT cp_to, char *ptr_to, size_t *len_to,
> > > > > >  		UINT cp_from, const char *ptr_from, size_t len_from,
> > > > > > @@ -126,8 +129,10 @@ convert_mb_str (UINT cp_to, char *ptr_to, size_t *len_to,
> > > > > >    tmp_pathbuf tp;
> > > > > >    wchar_t *wbuf = tp.w_get ();
> > > > > >    int wlen = 0;
> > > > > > -  if (cp_from == CP_UTF7)
> > > > > > -    /* MB_ERR_INVALID_CHARS does not work properly for UTF-7.
> > > > > > +  if (cp_from == CP_UTF7 || IS_ISO_2022 (cp_from) || IS_ISCII (cp_from))
> > > > > > +    /* - MB_ERR_INVALID_CHARS does not work properly for UTF-7.
> > > > > > +       - ISO-2022 is too complicated to handle correctly.
> > > > > > +       - FIXME: Not sure what to do for ISCII.
> > > > > >         Therefore, just convert string without checking */
> > > > > >      wlen = MultiByteToWideChar (cp_from, 0, ptr_from, len_from,
> > > > > >  				wbuf, NT_MAX_PATH);
> > > > > > -- 
> > > > > > 2.28.0
> > > > > 
> > > > > I'd prefer to not handle them at all.  We just don't support these
> > > > > charsets, same as JIS, EBCDIC, you name it, which are not ASCII
> > > > > compatible.  Let's please just drop any handling for these weird
> > > > > or outdated codepages.
> > > > 
> > > > What do you mean by "just drop any handling"? 
> > > > 
> > > > Do you mean removing the following if block?
> > > > > > +  if (cp_from == CP_UTF7 || IS_ISO_2022 (cp_from) || IS_ISCII (cp_from))
> > > > > > +    /* - MB_ERR_INVALID_CHARS does not work properly for UTF-7.
> > > > > > +       - ISO-2022 is too complicated to handle correctly.
> > > > > > +       - FIXME: Not sure what to do for ISCII.
> > > > > >         Therefore, just convert string without checking */
> > > > > >      wlen = MultiByteToWideChar (cp_from, 0, ptr_from, len_from,
> > > > > >  				wbuf, NT_MAX_PATH);
> > > > In this case, the conversion for ISO-2022, ISCII and UTF-7 will
> > > > not be done correctly.
> > > > 
> > > > Or skip charset conversion if the codepage is EBCDIC, ISO-2022
> > > > or ISCII? What should we do for UTF-7?
> > > 
> > > Nothing, just like for any other of these weird charsets.  Cygwin never
> > > supported any charset which wasn't at least ASCII compatible in the
> > > 0 <= x <= 127 range.  Just ignore them and the possibility that a
> > > user chooses them for fun.
> > > 
> > > > What should happen if the user or an app changes the codepage to one of them?
> > > 
> > > Garbage output, I guess.  We shouldn't really care.
> > 
> > Do you mean the attached patch?
> > 
> > Please try:
> > (1) Open mintty with "env CYGWIN=disable_pcon mintty".
> > (2) Start cmd.exe in that mintty.
> > (3) Try chcp such as
> >     37 (EBCDIC),
> >     65000 (UTF-7),
> >     50220 (ISO-2022),
> >     and 57002 (ISCII).
> > (4) Execute dir or some other commands in cmd.exe.
> > 
> > For 65000, 50220 and 57002, even the prompt will be broken.
> > Are the results as you expected?
> > 
> > If pseudo console is enabled, all of the above work without
> > problems. With the previous patch, the results were sane even
> > if pseudo console is disabled.
> 
> How about the attached patch?
> I think this is safer than the previous patch.

I have revised this patch to fit the current git head and submitted
it to cygwin-patches@cygwin.com.
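
For reference, the code page test from the first patch quoted above can be
sketched in plain C as follows. This is only an illustration and not the
revised patch: cp_needs_unchecked_conversion and mb2wc_flags_for are
hypothetical helper names, and the SKETCH_ constants are local stand-ins for
the Windows CP_UTF7 and MB_ERR_INVALID_CHARS values so that the snippet builds
without windows.h. It also assumes that the checked path passes
MB_ERR_INVALID_CHARS, which the quoted hunk implies but does not show.

```c
#include <stdbool.h>

/* Local stand-ins for the Windows constants (assumption: same values as
   CP_UTF7 and MB_ERR_INVALID_CHARS in the Windows headers). */
#define SKETCH_CP_UTF7 65000u
#define SKETCH_MB_ERR_INVALID_CHARS 0x00000008u

/* Range macros exactly as defined in the quoted patch. */
#define IS_ISO_2022(x) ( (x) >= 50220 && (x) <= 50229 )
#define IS_ISCII(x) ( (x) >= 57002 && (x) <= 57011 )

/* Hypothetical helper: true for code pages that the quoted patch converts
   without the invalid-character check (UTF-7, ISO-2022, ISCII). */
static bool
cp_needs_unchecked_conversion (unsigned int cp)
{
  return cp == SKETCH_CP_UTF7 || IS_ISO_2022 (cp) || IS_ISCII (cp);
}

/* Hypothetical helper: flags one would pass to MultiByteToWideChar ()
   for the given source code page under the assumption stated above. */
static unsigned int
mb2wc_flags_for (unsigned int cp)
{
  return cp_needs_unchecked_conversion (cp)
	 ? 0u : SKETCH_MB_ERR_INVALID_CHARS;
}
```

With these helpers, 65000, 50220 and 57002 select the unchecked conversion,
while code pages such as 932 or 65001 keep the check.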

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

