[BUG core?] Regression with parsing Windows’ command-line
Takashi Yano
takashi.yano@nifty.ne.jp
Sat Dec 3 10:28:10 GMT 2022
On Fri, 2 Dec 2022 19:40:30 -0800
Ilya Zakharevich wrote:
> On Wed, Nov 16, 2022 at 04:48:25AM -0800, I wrote:
> > De-quoting (converting the Windows’ command-line into argc/argv) does
> > not remove double quotes if characters not fit for 8-bit (?) are present.
> >
> > Broken in: CYGWIN_NT-6.1 Bu 3.3.4(0.341/5/3) 2022-01-31 19:35 x86_64 Cygwin
> > Works in: CYGWIN_NT-6.1-WOW Bu 2.2.1(0.289/5/3) 2015-08-20 11:40 i686 Cygwin
> >
> > To reproduce, do in CMD’s command line:
> >
> > D:\> D:\Programs\cygwin2022\bin\perl -wle "print for @ARGV" . "/i/" "/и/" .
> > .
> > /i/
> > "/и/"
> > .
>
> I triple-checked
> • with a Win10 machine (and a version of cygwin given above),
> • with a fresh latest(=test)-cygwin-dll installation on a Win7 (as above) machine.
>
> Same bug everywhere.
This certainly seems to be a problem in cygwin1.dll.
Although I am not sure this is the right fix, I have confirmed
that the following patch solves the issue.
diff --git a/newlib/libc/locale/lctype.c b/newlib/libc/locale/lctype.c
index 644669765..732d132e1 100644
--- a/newlib/libc/locale/lctype.c
+++ b/newlib/libc/locale/lctype.c
@@ -25,11 +25,20 @@
#define LCCTYPE_SIZE (sizeof(struct lc_ctype_T) / sizeof(char *))
+#ifdef __CYGWIN__
+static char numsix[] = { '\6', '\0'};
+#else
static char numone[] = { '\1', '\0'};
+#endif
const struct lc_ctype_T _C_ctype_locale = {
+#ifdef __CYGWIN__
+ "UTF-8", /* codeset */
+ numsix /* mb_cur_max */
+#else
"ASCII", /* codeset */
numone /* mb_cur_max */
+#endif
#ifdef __HAVE_LOCALE_INFO_EXTENDED__
,
{ "0", "1", "2", "3", "4", /* outdigits */
--
Takashi Yano <takashi.yano@nifty.ne.jp>