This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Filenames with Win32 special characters (or: Interix filename compatibility)


On Mar 11 09:55, Brian Dessent wrote:
> Corinna Vinschen wrote:
> 
> > We could enhance the method to handle uppercase ASCII chars as well.
> > Managed mounts could use the same method as normal mounts, just with
> > upper case ASCII chars transformed, too.
> > 
> > This would have the additional advantage that filenames on managed
> > mounts not only look almost normal, the length of the real path
> > also isn't changed due to the char transformation, like it is today.
> 
> Interesting.  The unchanged length sounds nice, but I'm not sure I
> follow about looking almost normal.  Any filename with uppercase
> characters would still look unintelligible in Explorer/any ANSI Win32
> app, wouldn't it?
> 
> Here's an alternative idea for the encoding.  What if we encode upper
> case letters as themselves plus a rare combining entity?  For example,
> there's a block U+FE00 - U+FE0F called simply VARIATION SELECTOR-1
> through VARIATION SELECTOR-16:
> <http://www.fileformat.info/info/unicode/block/variation_selectors/list.htm>.  
> 
> *experiments*
> 
> Well crap, those don't work very well, they display as boxes rather than
> combining.  But going through the entire list of combining characters, I
> did find one with an interesting property: U+0331: COMBINING MACRON
> BELOW.  When displayed in Explorer, it looks like the normal letter with
> a small underline.  But the neat property of this character is that when
> converted from Unicode to cp1252 it converts to the underscore, meaning
> stupid ANSI programs can still edit/open/save these files.  So we'd
> encode uppercase ASCII as simply 'A' -> "A\x0331", 'B' -> "B\x0331" and
> so on.  It doesn't have the property of the same length, but they still
> remain intelligible in dumb apps.
> 
> (BTW, for a real hoot try creating a filename containing U+034F
> COMBINING GRAPHEME JOINER.)
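
For illustration only, the encoding proposed above could be sketched
roughly as follows.  This is not Cygwin code: the function names
encode_name and decode_name are hypothetical, and the sketch only shows
the string transformation (uppercase ASCII letter -> same letter plus
U+0331), not how it would plug into path handling.  It does not try to
reproduce the Unicode-to-cp1252 conversion mentioned above.

```python
import re

MACRON_BELOW = "\u0331"  # U+0331 COMBINING MACRON BELOW


def encode_name(name: str) -> str:
    """Append U+0331 to every uppercase ASCII letter (hypothetical helper)."""
    return re.sub(r"[A-Z]", lambda m: m.group(0) + MACRON_BELOW, name)


def decode_name(stored: str) -> str:
    """Drop one U+0331 wherever it directly follows an uppercase ASCII letter."""
    return re.sub(r"([A-Z])" + MACRON_BELOW, r"\1", stored)


if __name__ == "__main__":
    stored = encode_name("Makefile")
    # The stored name is one code point longer per uppercase letter.
    print(len("Makefile"), len(stored))
    print(decode_name(stored) == "Makefile")
```

Since the stored form of "Makefile" carries an extra code point that
"makefile" lacks, the two names should no longer collide under a
case-insensitive comparison, at the cost of the longer on-disk name
already noted above.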

This approach sounds quite interesting!  Right now I can't test because
I broke almost everything in my local copy when I started to implement
the new cygwin_conv_path API.  I still have some debugging left for
tomorrow...  As soon as that stuff works, I will have another look
into that.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

