Can not stat file with utf char U+F020

Gionatan Danti g.danti@assyoma.it
Mon Apr 17 13:46:52 GMT 2023


On 2023-04-17 11:05 Corinna Vinschen wrote:
> It's actually not the "dos" mount option but specific filesystems
> which trigger the conversion from U+0020 to U+F020.

OK.

> However, the conversion back is handled in a piece of code which has
> no information about the underlying filesystem, so the F0xx -> 00xx
> conversion is done all the time.  Adding filesystem info in this
> place is really tricky.

Ah, I missed it, thanks! With this new information, I made some
progress.

First, I use the "dos" mount option to always trigger conversion of a
space or dot at the end of a filename into U+F0xx chars. Now I am able
to create such a strange-looking file (as seen in Explorer) from within
cygwin itself. For example, touch "zzs " now results in
"zzs+strangechar" in Explorer. Both cygwin and windows are able to
read/write such a file.
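
A minimal Python sketch of the forward mapping described above, to make
the example concrete. It assumes the rule is a fixed offset of 0xF000
(which is what U+0020 -> U+F020 suggests); the name encode_trailing is
hypothetical and this is not cygwin's actual code.

```python
PUA_OFFSET = 0xF000  # assumption: U+0020 -> U+F020 implies a fixed offset


def encode_trailing(name: str) -> str:
    """Replace each trailing space or dot with its private-use counterpart.

    Illustration only: special names such as "." and ".." are not handled.
    """
    stripped = name.rstrip(" .")          # part of the name left untouched
    tail = name[len(stripped):]           # run of trailing spaces and dots
    return stripped + "".join(chr(ord(c) + PUA_OFFSET) for c in tail)


print(ascii(encode_trailing("zzs ")))     # 'zzs\uf020'
```

A space or dot that is not at the end of the name is left as-is by this
sketch, matching the "at filename end" condition above.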

But if I rename the file in Explorer, adding an extension (i.e. from
"zzs+strangechar" to "zzs+strangechar.txt"), cygwin is suddenly
unable to read/write the file.

It seems to me that the appended chars prevent cygwin from translating
F0xx back to 00xx (as the PUA char is no longer at the end of the
filename).

So, two paths should be available:
- always translate back F0xx to 00xx even if not at the end of filename;
- otherwise, if it is too invasive to do so unconditionally, add an
option such as "always_translate_pua" (default: off) to enable this
behavior based on user needs.
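
The two paths can be sketched in Python as follows. This is only an
illustration under stated assumptions: the F0xx -> 00xx rule is taken to
be a fixed offset of 0xF000 over the range U+F001..U+F07F, the default
branch models the trailing-only behavior suspected above (which is not
confirmed), and decode_name and its always_translate_pua parameter are
hypothetical names, not an existing cygwin API.

```python
PUA_OFFSET = 0xF000  # assumption: U+F0xx maps back to U+00xx by a fixed offset


def _is_mapped(ch: str) -> bool:
    # Assumed range of mapped private-use characters: U+F001..U+F07F.
    return 0xF001 <= ord(ch) <= 0xF07F


def _unmap(ch: str) -> str:
    return chr(ord(ch) - PUA_OFFSET) if _is_mapped(ch) else ch


def decode_name(name: str, always_translate_pua: bool = False) -> str:
    if always_translate_pua:
        # Option 1: translate every mapped character, wherever it appears.
        return "".join(_unmap(c) for c in name)
    # Default: translate only the run of mapped characters at the very end.
    end = len(name)
    while end > 0 and _is_mapped(name[end - 1]):
        end -= 1
    return name[:end] + "".join(_unmap(c) for c in name[end:])


print(ascii(decode_name("zzs\uf020")))                                 # 'zzs '
print(ascii(decode_name("zzs\uf020.txt")))                             # 'zzs\uf020.txt'
print(ascii(decode_name("zzs\uf020.txt", always_translate_pua=True)))  # 'zzs .txt'
```

With the default, the renamed file keeps its PUA char; with the option
enabled, it comes back as a plain space in the middle of the name.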

I would (naively?) think that option 1 (always translate back PUA) 
should be the preferred approach, as cygwin is at the moment effectively 
unable to access some files.

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

