Ctrl-N sequences (was: Re: Console codepage setting via chcp?)

Corinna Vinschen corinna-cygwin@cygwin.com
Sun Sep 27 08:51:00 GMT 2009


On Sep 27 07:43, Andy Koppe wrote:
> 2009/9/26 Corinna Vinschen:
> > The \016\377\x is the multibyte
> > sequence which gets created from a lone U+DCxx UTF-16 value in
> > sys_cp_wcstombs.
> 
> Forgot to say: that makes sense.

I'm not so sure anymore.  The problem with this sequence is that it's
not valid UTF-8.  So, UTF-8 aware applications trying to display this
filename will have trouble converting it to wchar_t.  Maybe
this is not such a good idea.  Given your other mail about UCS-2
sequences (http://cygwin.com/ml/cygwin-developers/2009-09/msg00065.html)
it might make sense to change that accordingly.
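
To illustrate the concern, here is a minimal Python sketch.  It assumes
the sequence is the three bytes 0x0E (^N), 0xFF and one trailing byte.
The sample trailing value 0x80 and the helper name is_valid_utf8 are
made up for illustration; they are not taken from sys_cp_wcstombs.

```python
# Hypothetical sample: ^N (0x0E), 0xFF, then an arbitrary trailing byte.
seq = b"\x0e\xff\x80"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(seq))                 # False
print(is_valid_utf8(b"plain-name.txt"))   # True

# 0xFF can never occur in well-formed UTF-8, so the trailing byte
# makes no difference to the result.
print(any(is_valid_utf8(bytes([0x0E, 0xFF, x])) for x in range(256)))  # False
```

A strict decoder rejects the sequence whatever the third byte is, which
is the trouble a UTF-8 aware application would run into.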

> Which reminds me, regarding ^N sequences I stumbled across the following issue.
> 
> ^N switches vt100-compatible terminals to the so-called G1 character
> set, away from the default G0. ^O switches back. G0 and G1 can
> independently be mapped to ASCII, linedraw or VGA (as well as various
> obsolete ASCII variations). Different escape sequences are needed to
> configure G0 and G1. (vt220 adds G2 and G3, for more such fun.)
> 
> Hence the problem with ^N being displayed as part of a filename is
> that the normally used G0 sequences for switching to the linedraw or
> VGA character sets will no longer appear to work afterwards, since G1
> is active. Less likely, if G1 happens to be set to the linedraw
> charset, a ^N will turn all lowercase characters into hieroglyphics.
> (I did experience that effect because the xgraphics test file I'd
> attached had switched G1 to the linedraw charset.)
> 
> Therefore, is there a particular reason for choosing ^N

Searching for a meaningful lead byte, I read this:
http://en.wikipedia.org/wiki/Shift_Out_and_Shift_In_characters
That's why I chose ^N.
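
The terminal side effect described above can be modelled with a toy
state machine.  This is only a sketch under stated assumptions: both G0
and G1 start out as ASCII, only the designators "B" (ASCII) and "0"
(linedraw) are handled, and active_charsets is a hypothetical helper,
not part of any terminal emulator.

```python
SO = "\x0e"   # ^N, shift out: make G1 the active set
SI = "\x0f"   # ^O, shift in: make G0 the active set
ESC = "\x1b"


def active_charsets(stream: str) -> list[tuple[str, str]]:
    """Return (char, charset) for each printable character in stream."""
    sets = {"G0": "ascii", "G1": "ascii"}    # assumed power-on state
    names = {"B": "ascii", "0": "linedraw"}  # only two designators modelled
    active = "G0"
    out = []
    i = 0
    while i < len(stream):
        ch = stream[i]
        if ch == SO:
            active = "G1"
        elif ch == SI:
            active = "G0"
        elif ch == ESC and stream[i + 1:i + 2] in ("(", ")"):
            # ESC ( x designates G0, ESC ) x designates G1.
            slot = "G0" if stream[i + 1] == "(" else "G1"
            final = stream[i + 2:i + 3]
            sets[slot] = names.get(final, sets[slot])
            i += 2
        else:
            out.append((ch, sets[active]))
        i += 1
    return out


# Normal case: switching G0 to linedraw takes effect.
print(active_charsets(ESC + "(0" + "q"))        # [('q', 'linedraw')]
# A stray ^N beforehand: G1 is active, so the G0 switch seems to do nothing.
print(active_charsets(SO + ESC + "(0" + "q"))   # [('q', 'ascii')]
# G1 already set to linedraw: ^N turns ordinary letters into linedraw glyphs.
print(active_charsets(ESC + ")0" + SO + "ab"))  # [('a', 'linedraw'), ('b', 'linedraw')]
```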

> , or could it
> be changed to a control character that does not have a special meaning
> to terminals? The following are taken:
> 
> ^E ^G ^H ^I ^J ^K ^L ^M ^N ^O ^[
> 
> And these ones are usually taken by the pty (although they're less of
> an issue, as they're special only on input rather than output):
> 
> ^C ^D ^Q ^R ^S ^U ^V ^W ^Z ^\
> 
> Anyway, this leaves:
> 
> ^A ^B ^F ^P ^T ^X ^Y ^] ^^ ^_
> 
> All of which mean something or other to readline, but again, they're
> only special on keyboard input, not on output to the terminal screen.
> 
> How about ^X? Somewhat mnemonic, because Alt+X allows entering Unicode
> codepoints in many Windows apps.

The lead byte only has a meaning to Cygwin anyway.  ^X is as good as ^N.
I can't believe anybody has already written a script or executable
that relies on the ^N in a filename.
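
For reference, the list of remaining control characters can be
rechecked mechanically.  The sketch below is illustrative only: the
ctrl helper is hypothetical, and leaving out ^@ (NUL) is my assumption,
since it was not part of the lists quoted above.

```python
def ctrl(caret: str) -> int:
    """Map caret notation such as '^X' to its byte value (0x18)."""
    return ord(caret[1]) ^ 0x40


# Lists as quoted in this thread.
terminal = "^E ^G ^H ^I ^J ^K ^L ^M ^N ^O ^[".split()
pty = "^C ^D ^Q ^R ^S ^U ^V ^W ^Z ^\\".split()

taken = {ctrl(c) for c in terminal + pty}

# Control characters 0x01..0x1F that are in neither list (NUL excluded).
free = [b for b in range(0x01, 0x20) if b not in taken]
free_caret = ["^" + chr(b ^ 0x40) for b in free]

print(" ".join(free_caret))   # ^A ^B ^F ^P ^T ^X ^Y ^] ^^ ^_
print(hex(ctrl("^N")), hex(ctrl("^X")))   # 0xe 0x18
```

The result matches the list of free characters given above, with ^X
being byte 0x18.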


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


