bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

L A Walsh cygwin@tlinx.org
Thu Apr 2 06:43:38 GMT 2020


On 2020/03/24 00:18, Jay Libove via Cygwin wrote:
> Problem:
> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs' built-in argv[] globbing will produce unexpected:
> "{programName}: cannot access '{glob pattern}': No such file or directory"
> e.g.
> "ls: cannot access '*.pdf': No such file or directory"
> .. despite the fact that e.g. *.pdf definitely exists.
>   
----
    This isn't a bug or a problem, it is working normally as expected.
Cygwin programs don't have built-in argv[] globbing or processing.

    The problem you are seeing is because you are calling cygwin programs
from a windows shell.

    On windows, every program has to be built with glob processing.

    On unix, glob processing happens in the shell, so all unix
(linux+cygwin) type programs have no glob processing because they know
that globbing is built into the shell (like bash or csh, or dash, etc).

If you run 'ls' *.pdf in bash, bash expands the *.pdf into arguments
that don't contain a glob (if the glob matches a file).  So 'ls' sees
only fixed filenames and no globs.
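
A minimal sketch of that, assuming bash (or another POSIX shell) and a
scratch directory created with mktemp. The file names and the count_args
helper are made up for illustration:

```shell
# Scratch directory so the example does not touch real files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.pdf b.pdf

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

expanded=$(count_args *.pdf)    # the shell expands the glob before the call
quoted=$(count_args '*.pdf')    # quoting suppresses expansion: one literal argument

echo "unquoted: $expanded argument(s), quoted: $quoted argument(s)"
```

The helper never sees an asterisk in the unquoted case; it only sees the
file names the shell already substituted.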

When you run 'ls' from the Windows shell, Windows cmd.exe doesn't expand
glob chars into anything, so 'ls' sees a literal file name of '*.pdf'.

On linux you can name a file '*.pdf' (using an asterisk as a valid character).
Unless you have a file named, literally, '*.pdf', ls won't see it.
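
Something similar can be seen inside bash itself when a pattern matches
nothing. This is only a sketch, assuming the shell's default settings (no
nullglob or failglob) and a made-up scratch directory:

```shell
# Scratch directory containing no .pdf files at all.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch notes.txt

# Nothing matches, so by default the shell passes the pattern through unchanged.
set -- *.pdf
unmatched=$1

# Once a file with that literal name exists, the same argument names a real file.
touch '*.pdf'
if [ -e "$unmatched" ]; then found=yes; else found=no; fi

echo "argument: $unmatched, exists: $found"
```

Before the second touch, a program handed that argument would report
"No such file or directory" for '*.pdf', which is the same message as in
the original report.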

Cygwin does simulate this. Example:
>  cd /tmp
/tmp> touch \*.pdf
/tmp> ls *.pdf
*.pdf
/tmp> cmd
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\tmp>ls *.pdf
ls *.pdf
'*.pdf'

^^ note that 'ls' run from the Windows shell now finds *.pdf, because there
is a file named '*.pdf' (quotes added by 'ls').

Does this explain your issue, or am I not understanding it?

Thanks (I'm not a cygwin author; just answering the question)
Linda

> Steps to Reproduce:
> * Have some files in the local directory with accented characters in the names, e.g.:
> C:> mkdir c:\temp\test
> C:> cd c:\temp\test
> C:> touch héllo.pdf
> C:> touch gòodbye.pdf
> C:> touch normal.pdf
> * DON'T have the LANG= environment variable set to anything
> * NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe, execute a Cygwin command which needs to do file name globbing because the Windows CMD.exe shell does not do so for it, e.g.
> C:> ls *.pdf
> C:> cat *.pdf
> These will produce "ls: cannot access '*.pdf': No such file or directory"
> Although, curiously,
> C:> ls *or*
> does correctly produce:
> normal.pdf
>
> Also, display output of the accented characters is incomplete:
> C:> ls
> 'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
> C:> bash
> jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
> $ ls
> 'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
>
>
> Analysis:
> I've verified that it's not about case sensitivity. That is, it's not a matter of ls *.pdf vs. ls *.PDF.
> If these test commands are run either under bash.exe or within a Cygwin Terminal window, the problem does not occur.
> I've verified that the Windows system locale (per Windows' Region setting) actually doesn't matter. (I've reproduced this both on systems in Region Spain with language English-International and English-Ireland, and in a VM with a bog standard vanilla US English Windows).
>
> Credits to Paul for suggesting deleting files one by one until the problem goes away, and to Andrey for pointing out `locale` and the LANG= setting.
>
> Set LANG=en_US.UTF-8, e.g.
> C:> set LANG=en_US.UTF-8
> .. and the problem goes away.
> C:> ls *.pdf
> gòodbye.pdf
> héllo.pdf
> normal.pdf
> C:> ls
> gòodbye.pdf
> héllo.pdf
> normal.pdf
>
> Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he doesn't see the problem. When I tried that exact setting, I still had the problem.
> So it's maybe not just that LANG must be set to *something*, but that somehow LANG must be set to something that matches something in Windows? (Sorry, I know that's nearly uselessly vague).
>
>
> In summary, it appears that the argv[] globbing code which gets compiled in to Cygwin programs functions a bit differently than the shell globbing code within bash.exe.
> And this produces unexpected globbing failures.
>
>
> Thanks to all the Cygwin maintainers for this amazing software, for so many years!
> -Jay


