Quotes around command-line argument that has unicode characters are not removed
Mikhail Usenko via cygwin
cygwin@cygwin.com
Thu Mar 22 13:35:00 GMT 2018
On Thu, 22 Mar 2018 01:15:00 +0100
Dmitry Katsubo via cygwin <...> wrote:
> Dear Cygwin community,
>
> I observe the following on my Cygwin: when I put quotes around a file name that
> contains non-ASCII symbols, the quotes are passed to the process's argv literally;
> otherwise they are removed. I would expect this to be consistent.
>
> I have written a small C program that displays its arguments, and ran it three
> times:
>
> #1 For the file with a space, enclosed in quotes ("the file.txt") -- OK
> #2 For the file with non-ASCII characters (Château.txt) -- OK
> #3 For the file with non-ASCII characters, enclosed in quotes ("Château.txt") -- WRONG
>
> d:\cli> uname -a
> CYGWIN_NT-6.1-WOW PC 2.9.0(0.318/5/3) 2017-09-12 10:41 i686 Cygwin
>
> D:\cli> chcp
> Active code page: 866
>
> D:\cli> dir
> ...cut...
> 2018-03-22 00:43 0 Château.txt
> 2018-03-22 00:01 393 test.c
> 2018-03-22 00:01 150,230 test.exe
> 2018-03-21 00:15 186 test.pl
> 2018-03-22 00:43 0 the file.txt
> 2018-03-22 00:40 16 текст плюс.txt
> 6 File(s) 150,825 bytes
> 2 Dir(s) 41,972,293,632 bytes free
>
> D:\cli> test "the file.txt"
> param 0 = test
> param 1 = the file.txt
> File 'the file.txt' was opened
>
> D:\cli> test Château.txt
> param 0 = test
> param 1 = Château.txt
> File 'Château.txt' was opened
>
> D:\cli> test "Château.txt"
> param 0 = test
> param 1 = "Château.txt"
> Failed to open '"Château.txt"': No such file or directory
>
> As one can see, the last run fails. I am a bit puzzled: how can I pass the name
> of a file that contains spaces and Unicode symbols? I need to do it in a uniform
> way, as I am calling a Cygwin program from a native Windows program, as in [1].
>
> D:\cli> test "текст плюс.txt"
> param 0 = test
> param 1 = "текст плюс.txt"
> Failed to open '"текст плюс.txt"': No such file or directory
>
> I have searched a bit, but I couldn't find a direct answer. From posts [1] and [2]
> I see that the compiler inserts code that does some argument pre-processing, such
> as @pathnames [3], but what exactly are the rules? Is quote pre-processing done in
> dcrt0.cc:177 [4]?
>
> Any feedback is appreciated.
>
> [1] https://sourceware.org/ml/cygwin/2016-05/msg00082.html
> [2] http://daviddeley.com/autohotkey/parameters/parameters.htm
> [3] https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-at
> [4] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L177
>
> === test.c ===
> #include <stdio.h>
> #include <errno.h>
> #include <string.h>
>
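> int main(int argc, char* argv[])
> {
>     /* Print every argument exactly as the runtime delivered it. */
>     for (int i = 0; i < argc; i++)
>     {
>         printf("param %d = %s\n", i, argv[i]);
>     }
>     /* Guard against a missing file name before touching argv[1]. */
>     if (argc < 2)
>     {
>         fprintf(stderr, "usage: %s <file>\n", argv[0]);
>         return 1;
>     }
>     /* Try to open the first argument as a file name. */
>     FILE* f = fopen(argv[1], "r");
>     if (f != NULL)
>     {
>         printf("File '%s' was opened\n", argv[1]);
>         fclose(f);
>     } else {
>         printf("Failed to open '%s': %s\n", argv[1], strerror(errno));
>     }
>     return 0;
> }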
>
> --
Hello, Dmitry,
consider these test cases:
Native (msvcrt) binary:
-----------------------
$ x86_64-w64-mingw32-gcc test.c -o test-win.exe
$ ldd test-win.exe
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x7fa03220000)
-----------------------
Cygwin-flavor binary:
---------------------
$ gcc test.c -o test-cygwin.exe
$ ldd test-cygwin.exe
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
cygwin1.dll => /usr/bin/cygwin1.dll (0x180040000)
---------------------
Create a file with non-ASCII chars in the name:
-----------------------------------------------
$ touch "текст плюс.txt"
-----------------------------------------------
Run both binaries in mintty with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = âââââ ââââ.txt
File 'âââââ ââââ.txt' was opened
$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------
Run the binaries in cmd.exe with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = ÐÑ
ÑÑÐ ÑÑâ Ñ.txt
File 'ÐÑ
ÑÑÐ ÑÑâ Ñ.txt' was opened
$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------
Run in bare cmd.exe
(/usr/bin/cygwin1.dll should be copied next to ./test-cygwin.exe)
-------------------
D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-win.exe "текст плюс.txt"
param 0 = .\test-win.exe
param 1 = ÐÑ
ÑÑÐ ÑÑâ Ñ.txt
File 'ÐÑ
ÑÑÐ ÑÑâ Ñ.txt' was opened
D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-cygwin.exe "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory
-------------------
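Because the console code page changes how the same bytes are displayed, it may help to print each argument as hex bytes rather than as text. The following is a minimal sketch, not part of the original test.c: the function names has_literal_quotes and hex_dump_arg and the ARGDUMP_MAIN macro are my own, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if arg starts and ends with a double quote (byte 0x22),
   meaning the quotes were delivered literally instead of being removed. */
int has_literal_quotes(const char *arg)
{
    assert(arg != NULL);
    size_t len = strlen(arg);
    return len >= 2 && arg[0] == '"' && arg[len - 1] == '"';
}

/* Writes the bytes of arg into out as space-separated two-digit hex.
   Returns the length of the resulting string, or -1 if out is too small. */
int hex_dump_arg(const char *arg, char *out, size_t outsz)
{
    assert(arg != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';
    size_t pos = 0;
    for (const unsigned char *p = (const unsigned char *)arg; *p != '\0'; p++) {
        int n = snprintf(out + pos, outsz - pos,
                         pos ? " %02x" : "%02x", (unsigned)*p);
        if (n < 0 || (size_t)n >= outsz - pos)
            return -1;          /* output would have been truncated */
        pos += (size_t)n;
    }
    return (int)pos;
}

#ifdef ARGDUMP_MAIN  /* hypothetical switch: build with -DARGDUMP_MAIN */
int main(int argc, char *argv[])
{
    char hex[1024];
    for (int i = 0; i < argc; i++) {
        if (hex_dump_arg(argv[i], hex, sizeof hex) < 0)
            strcpy(hex, "(too long)");
        printf("param %d: %s%s\n", i, hex,
               has_literal_quotes(argv[i]) ? "  [literal quotes]" : "");
    }
    return 0;
}
#endif
```

Built with -DARGDUMP_MAIN for both the native and the Cygwin toolchain, this shows whether byte 22 surrounds the file name in argv, and which encoding the name arrived in (a UTF-8 "â", for instance, appears as c3 a2).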
In bare cmd.exe the native (msvcrt) binary handles quoted non-ASCII arguments
correctly, while the Cygwin binary receives the quotes literally. I don't know
exactly which layer is responsible for this behavior: cmd.exe, or the runtime
(msvcrt.dll versus cygwin1.dll).
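One way to narrow this down: as far as I understand, a Windows process receives a single command-line string, and the runtime's startup code splits it into argv. The sketch below shows what "removing the quotes" amounts to in such a splitter. Its rules are deliberately simplified (whitespace separates arguments, double quotes group and are dropped, no backslash or @file handling), so it is not the actual algorithm of msvcrt.dll or cygwin1.dll. The name split_cmdline is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Splits cmdline in place into at most max_args NUL-terminated arguments.
   Double quotes toggle "quoted" mode and are removed from the result;
   bytes >= 0x80 (e.g. UTF-8 sequences) are copied like any other byte.
   Returns the argument count, or -1 if there are more than max_args. */
int split_cmdline(char *cmdline, char **argv, int max_args)
{
    assert(cmdline != NULL && argv != NULL);
    int argc = 0;
    char *src = cmdline;

    while (*src != '\0') {
        /* Skip separators between arguments. */
        while (*src == ' ' || *src == '\t')
            src++;
        if (*src == '\0')
            break;
        if (argc == max_args)
            return -1;

        char *dst = src;            /* dst never runs ahead of src */
        argv[argc++] = dst;
        int in_quotes = 0;

        while (*src != '\0' && (in_quotes || (*src != ' ' && *src != '\t'))) {
            if (*src == '"')
                in_quotes = !in_quotes;   /* drop the quote itself */
            else
                *dst++ = *src;
            src++;
        }

        int at_end = (*src == '\0');      /* check before terminating */
        *dst = '\0';
        if (at_end)
            break;
        src++;                            /* step past the separator */
    }
    return argc;
}
```

With these rules, the string `test "the file.txt"` yields the two arguments `test` and `the file.txt`. A splitter that treats non-ASCII bytes like any other byte would strip the quotes from `"Château.txt"` in the same way, which is the consistent behavior the original report expected.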