Thu May 11 04:48:00 GMT 2006
Thomas Wolff wrote:
> Sorry for the very late response, but I've finally successfully
> persuaded rxvt-unicode to actually support Unicode on cygwin,
> and I'd like to suggest including that in the package.
That's great, thank you very much. I received your other emails and
will take a look as soon as possible. However, I'll let the brand new
(not even announced yet) rxvt-unicode-X package stay as-is for a while
to give folks a chance to try it out before incorporating any new
> Some general remarks:
> Depending on the application, Unicode may be triggered either
> 1) explicitly or
> 2) using the locale mechanism (which is bogus on cygwin).
> It should be noted that the set of locale variables (LC_* and LANG)
> is not identical to the locale mechanism, which needs additional
> library support.
> 1) For example, xterm has an explicit command line option:
> xterm -u8
> which invokes xterm in UTF-8 mode. Additional configuration is
> needed to use Unicode fonts. And LC_* variables are unfortunately
> not set implicitly in this invocation mode which confuses many
> My package mined includes a script uterm which invokes xterm in a
> suitable mode, including font setup. Cygwin/X does include some
> Unicode fonts, but apparently a very outdated version of them with
> a very limited character range. I would offer to maintain a package
> of Unicode X fonts if that helps.
> 2) Rxvt insists on locale configuration to provide desired encodings.
> This means, you would have to invoke rxvt like this:
> LC_CTYPE=en_US.UTF-8 rxvt
> LC_ALL=vi_VN rxvt
> (Note: vi_VN is one of the UTF-8 locales that lack the usual
> indication suffix.)
> And rxvt would run in UTF-8 mode where the locale mechanism
> works (which it doesn't on cygwin).
So, you're saying that rxvt-unicode doesn't have an explicit switch, but
relies on pre-existing env vars. This is good, because the apps one
runs IN the terminal will need those env vars too, something a command
line switch won't set for you properly anyway.
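To make the env-var lookup concrete, here is a minimal sketch of how a program could decide which variable governs the character encoding. It follows the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG, skipping unset or empty values). The function names pick_ctype, names_utf8 and env_requests_utf8 are made up for illustration and are not taken from rxvt-unicode. Note that a name-based check like this cannot recognize a UTF-8 locale that lacks the suffix, such as vi_VN.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Pick the value that governs LC_CTYPE: LC_ALL wins over LC_CTYPE,
   which wins over LANG. NULL (unset) and empty values are skipped.
   Returns NULL if none of the three is usable. */
const char *pick_ctype(const char *lc_all, const char *lc_ctype,
                       const char *lang)
{
    const char *candidates[3] = { lc_all, lc_ctype, lang };
    for (size_t i = 0; i < 3; i++)
        if (candidates[i] != NULL && candidates[i][0] != '\0')
            return candidates[i];
    return NULL;
}

/* Return 1 if the codeset part of a locale name (the text after '.',
   up to '@' or the end) spells "utf8", ignoring case and hyphens.
   A name without a codeset part, e.g. "vi_VN", yields 0. */
int names_utf8(const char *locale)
{
    static const char want[] = "utf8";
    size_t matched = 0;
    const char *p;

    assert(locale != NULL);
    p = strchr(locale, '.');
    if (p == NULL)
        return 0;
    for (p++; *p != '\0' && *p != '@'; p++) {
        if (*p == '-')
            continue;                 /* accept both UTF-8 and UTF8 */
        if (matched >= 4 || tolower((unsigned char)*p) != want[matched])
            return 0;
        matched++;
    }
    return matched == 4;
}

/* Hypothetical convenience wrapper that reads the real environment. */
int env_requests_utf8(void)
{
    const char *loc = pick_ctype(getenv("LC_ALL"), getenv("LC_CTYPE"),
                                 getenv("LANG"));
    return loc != NULL && names_utf8(loc);
}
```

With LC_CTYPE=en_US.UTF-8 this reports UTF-8; with LC_ALL=vi_VN it does not, which is exactly the case where only a working locale mechanism would know better.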
> The reason why I couldn't trick out rxvt before by just setting the
> variables was that it also depends on the wide character library
> functions which in turn depend on a working locale mechanism.
If the wide char library functions don't exist, then rxvt ignores the LC
vars anyway. Gotcha.
> I have now replaced those functions (well, the subset of them needed
> by rxvt) with substitutes that either operate in UTF-8 mode, or
> delegate to the system functions, depending on the setting of the
> locale variables, and it works.
Shims -- that's a reasonable approach. (I'd prefer if unicode/locale
support were added to cygwin's version of newlib but that might be
Augean Stables-level of effort.) OTOH, I *really* prefer
things-that-work, sooner rather than later -- so this is good.
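For readers following along, a shim of this kind might look roughly like the sketch below. This is not Thomas's actual code, which I am not reproducing here. The names utf8_decode and shim_mbtowc are hypothetical, and the utf8_mode flag is assumed to come from inspecting the locale variables. In UTF-8 mode the shim decodes the bytes itself; otherwise it hands off to the system's mbtowc(). One simplifying assumption: the result is stored in a single wchar_t, so where wchar_t is only 16 bits wide, code points above U+FFFF would need extra handling that this sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode one UTF-8 sequence from at most n bytes of s.
   Mirrors mbtowc() return values: the sequence length on success,
   0 if the decoded character is NUL, -1 on invalid or truncated input. */
int utf8_decode(uint32_t *out, const char *s, size_t n)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t cp, min;
    int len, i;

    assert(out != NULL && s != NULL);
    if (n == 0)
        return -1;

    if (p[0] < 0x80)                { cp = p[0];        len = 1; min = 0; }
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; len = 2; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; len = 3; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; len = 4; min = 0x10000; }
    else
        return -1;                  /* stray continuation or invalid lead byte */

    if ((size_t)len > n)
        return -1;                  /* sequence is cut off */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return -1;              /* not a continuation byte */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;

    *out = cp;
    return cp == 0 ? 0 : len;
}

/* Hypothetical replacement for mbtowc(): decode UTF-8 directly when
   utf8_mode is nonzero, otherwise delegate to the C library. */
int shim_mbtowc(wchar_t *pwc, const char *s, size_t n, int utf8_mode)
{
    uint32_t cp;
    int len;

    if (!utf8_mode)
        return mbtowc(pwc, s, n);
    if (s == NULL)
        return 0;                   /* UTF-8 has no shift state */
    len = utf8_decode(&cp, s, n);
    if (len >= 0 && pwc != NULL)
        *pwc = (wchar_t)cp;         /* assumes cp fits in wchar_t */
    return len;
}
```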
> At least it does so for display,
> although it suppresses 8-bit input for some obscure reason still to be
I'm just guessing, but this could be related to the configure settings
in my build script, if that's what you were using:
--enable-shared --enable-utmp --enable-wtmp --enable-lastlog \
--enable-xft --enable-font-styles --disable-xim --enable-combining \
--enable-fallback=Rxvt --with-res-name=urxvt --with-res-class=URxvt \
--enable-xpm-background --enable-menubar --enable-rxvt-scroll \
--enable-next-scroll --enable-xterm-scroll --enable-plain-scroll \
--enable-transparency --enable-tinting --enable-fading \
--enable-frills --enable-smart-resize --enable-pointer-blank \
--enable-mousewheel --enable-slipwheeling --enable-keepscrolling \
--enable-old-selection --disable-perl \
Note --disable-xim, and also that --enable-8bitctrls is not specified.
Now, the latter is "not recommended" and its only effect is the
following block of code in the input-processing loop:
// 8-bit controls
case 0x90: /* DCS */
case 0x9b: /* CSI */
case 0x9d: /* OSC */
So, I don't think that's it.
While 8bit input != xim, there are two things I've discovered about the
(1) very little testing is done in non-default configurations (and
--enable-xim is the default)
(2) some #define macros turn on/turn off more than their simple names
and descriptions might suggest -- and the code often makes unwarranted
assumptions (e.g. see earlier thread about an unwarranted linkage
between transparency and XPM support)
So, it's possible that --disable-xim turns off some non-XIM input
support needed for 8bit entry.
Also, try the ISO 14755 support (Ctrl-Shift-key). Maybe that helps?
Finally, input is a cooperative affair between the terminal, the shell,
and for X11 terminals, the Xserver. In the case of bash, that also
includes readline. How's your ~/.inputrc set up?
# don't strip characters to 7 bits when reading
set input-meta on
# allow iso-latin1 characters to be inserted rather
# than converted to prefix-meta sequences
set convert-meta off
# display characters with the eighth bit set directly
# rather than as meta-prefixed characters
set output-meta on
Also, are you sure that the "meta" key is what you think it is? You can
force it by using the -mod cmdline option of rxvt-unicode (see that
urxvt manpage). I think the cygwin Xserver defaults to using Alt.
And then, there's the -meta8 cmdline option to rxvt-unicode:
True: handle Meta (Alt) + keypress to set the 8th bit.
False: handle Meta (Alt) + keypress as an escape prefix
[False is default].
Maybe you want True?
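To illustrate the difference between the two settings, here is a small sketch of what each one would send to the application for Meta + key. It is only an illustration of the behaviour described above, not rxvt-unicode's actual code; encode_meta_key is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Write the byte(s) a terminal would send for Meta + key into out
   and return how many bytes were written (1 or 2).
   meta8 != 0: set the 8th bit of the key (one byte).
   meta8 == 0: send ESC followed by the unmodified key (two bytes). */
size_t encode_meta_key(unsigned char key, int meta8, unsigned char out[2])
{
    assert(out != NULL);
    if (meta8) {
        out[0] = (unsigned char)(key | 0x80);
        return 1;
    }
    out[0] = 0x1B;                  /* ESC */
    out[1] = key;
    return 2;
}
```

For Meta + 'a' (0x61) this gives the single byte 0xE1 with meta8 True and the pair 0x1B 0x61 with meta8 False. Keep in mind that a lone 0xE1 is not a complete UTF-8 sequence, so whether True helps may depend on what the application expects to read.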
> I will send the files to you (Charles Wilson) directly and would
> appreciate if you confirm the solution.
Quick perusal looks pretty good. I like the caching of is_u_utf8_mode,
but you should watch out: --enable-frills turns on
'locale switching escape sequence'
so you might need to add a hook in that handler to "un-cache".
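Roughly what I have in mind, as a sketch only: every name below is invented for illustration (the real variable is whatever your patch uses), and active_locale merely stands in for however the terminal tracks its current locale. The point is just that the handler for the locale-switching sequence resets the cached value so the next query probes again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the terminal's current locale name (hypothetical). */
const char *active_locale = "C";

/* Cached answer: -1 = not computed yet, 0 = not UTF-8, 1 = UTF-8. */
int cached_utf8_mode = -1;

/* Counts how often the (assumed expensive) probe actually runs. */
unsigned probe_count = 0;

/* Simplified probe: only looks for "UTF-8" in the locale name. */
int probe_utf8_mode(void)
{
    probe_count++;
    return strstr(active_locale, "UTF-8") != NULL;
}

/* Cached query: probes once, then reuses the stored answer. */
int utf8_mode(void)
{
    if (cached_utf8_mode < 0)
        cached_utf8_mode = probe_utf8_mode();
    return cached_utf8_mode;
}

/* Hook for the locale-switching handler: record the new locale and
   drop the cached answer. The caller must keep new_locale alive,
   since only the pointer is stored. */
void on_locale_switch(const char *new_locale)
{
    assert(new_locale != NULL);
    active_locale = new_locale;
    cached_utf8_mode = -1;          /* "un-cache" */
}
```

Without the reset in on_locale_switch, utf8_mode() would keep returning the answer computed for the old locale.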