This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Filenames with Win32 special characters (or: Interix filename compatibility)


Corinna Vinschen wrote:

> We could enhance the method to handle uppercase ASCII chars as well.
> Managed mounts could use the same method as normal mounts, just with
> upper case ASCII chars transformed, too.
> 
> This would have the additional advantage that filenames on managed
> mounts not only look almost normal, the length of the real path
> also isn't changed due to the char transformation, like it is today.

Interesting.  The unchanged length sounds nice, but I'm not sure I
follow about looking almost normal.  Any filename with uppercase
characters would still look unintelligible in Explorer/any ANSI Win32
app, wouldn't it?

Here's an alternative idea for the encoding.  What if we encode upper
case letters as themselves plus a rare combining entity?  For example,
there's a block U+FE00 - U+FE0F called simply VARIATION SELECTOR-1
through VARIATION SELECTOR-16:
<http://www.fileformat.info/info/unicode/block/variation_selectors/list.htm>.  

*experiments*

Well crap, those don't work very well, they display as boxes rather than
combining.  But going through the entire list of combining characters, I
did find one with an interesting property: U+0331: COMBINING MACRON
BELOW.  When displayed in Explorer, it looks like the normal letter with
a small underline.  But the neat property of this character is that when
converted from Unicode to cp1252 it converts to the underscore, meaning
stupid ANSI programs can still edit/open/save these files.  So we'd
encode uppercase ASCII letters as the letter followed by U+0331, i.e.
'A' -> "A\u0331", 'B' -> "B\u0331" and so on.  This doesn't preserve
the length of the name, but the names still remain intelligible in
dumb apps.
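
To make the proposed mapping concrete, here is a minimal sketch in Python.
It is only an illustration of the idea, not Cygwin code: the names
encode_name and decode_name are hypothetical, and it assumes that only
the ASCII letters A-Z get the marker.  It does not reproduce the
Unicode-to-cp1252 conversion that Windows performs; that behaviour is
what was observed in the experiment above and is not tested here.

```python
# Hypothetical helpers illustrating the "uppercase letter + U+0331" idea.
MACRON_BELOW = "\u0331"  # U+0331 COMBINING MACRON BELOW


def encode_name(name: str) -> str:
    """Append U+0331 after every uppercase ASCII letter (A-Z)."""
    return "".join(
        ch + MACRON_BELOW if "A" <= ch <= "Z" else ch
        for ch in name
    )


def decode_name(stored: str) -> str:
    """Drop the single U+0331 that directly follows an uppercase ASCII letter."""
    out = []
    i = 0
    while i < len(stored):
        ch = stored[i]
        out.append(ch)
        # Skip exactly one marker after an uppercase ASCII letter, if present.
        if "A" <= ch <= "Z" and stored[i + 1:i + 2] == MACRON_BELOW:
            i += 2
        else:
            i += 1
    return "".join(out)


if __name__ == "__main__":
    original = "ReadMe.TXT"
    stored = encode_name(original)
    print(len(original), len(stored))
    print(decode_name(stored) == original)
```

Each uppercase letter costs one extra UTF-16 code unit in the stored
name, which is the length growth mentioned above.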

(BTW, for a real hoot try creating a filename containing U+034F
COMBINING GRAPHEME JOINER.)

Brian

