ITP: rxvt-unicode-X

Charles Wilson cygwin@cwilson.fastmail.fm
Thu May 11 04:48:00 GMT 2006


Thomas Wolff wrote:
> Sorry for the very late response, but I've finally successfully 
> persuaded rxvt-unicode now to actually support Unicode on cygwin, 
> and I'd like to suggest including that in the package.

That's great, thank you very much.  I received your other emails and 
will take a look as soon as possible.  However, I'll let the brand new 
(not even announced yet) rxvt-unicode-X package stay as-is for a while 
to give folks a chance to try it out before incorporating any new 
features/changes.

> Some general remarks:
> Depending on the application, Unicode may be triggered either
> 1) explicitly or
> 2) using the locale mechanism (which is bogus on cygwin).
>    It should be noted that the set of locale variables (LC_* and LANG) 
>    is not identical to the locale mechanism, which needs additional 
>    library support.
> 
> 1) For example, xterm has an explicit command line option:
> 	xterm -u8
>    which invokes xterm in UTF-8 mode. Additional configuration is 
>    needed to use Unicode fonts. And LC_* variables are unfortunately 
>    not set implicitly in this invocation mode which confuses many 
>    applications.
> 
>    My package mined includes a script uterm which invokes xterm in a 
>    suitable mode, including font setup. Cygwin/X does include some 
>    Unicode fonts, but apparently a very outdated version of them with 
>    a very limited character range. I would offer to maintain a package 
>    of Unicode X fonts if that helps.
> 
> 2) Rxvt insists on locale configuration to provide desired encodings.
>    This means, you would have to invoke rxvt like this:
> 	LC_CTYPE=en_US.UTF-8 rxvt
>    or
> 	LC_ALL=vi_VN rxvt
>    (Note: vi_VN is one of the UTF-8 locales that lack the usual 
>    indication suffix.)
>    And rxvt would run in UTF-8 mode where the locale mechanism 
>    works (which it doesn't on cygwin).

So, you're saying that rxvt-unicode doesn't have an explicit switch, but 
relies on pre-existing env vars.  This is good, because the apps one 
runs IN the terminal will need those env vars too, something a command 
line switch won't set for you properly anyway.

BUT...

> The reason why I couldn't trick out rxvt before by just setting the 
> variables was that it also depends on the wide character library 
> functions which in turn depend on a working locale mechanism.

if the wide char library functions don't exist, then rxvt ignores the LC 
vars anyway.  Gotcha.

> I have now replaced those functions (well, the subset of them needed 
> by rxvt) with substitutes that either operate in UTF-8 mode, or 
> delegate to the system functions, depending on the setting of the 
> locale variables, and it works. 

Shims -- that's a reasonable approach.  (I'd prefer if unicode/locale 
support were added to cygwin's version of newlib but that might be 
Augean Stables-level of effort.) OTOH,  I *really* prefer 
things-that-work, sooner rather than later -- so this is good.
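
For anyone following along, here is a rough sketch of what such a shim 
might look like. It is not Thomas's actual code: the names names_utf8, 
utf8_decode and shim_mbtowc are made up for illustration. It assumes the 
usual LC_ALL, LC_CTYPE, LANG precedence and a plain substring match on 
the locale name (so a suffix-less UTF-8 locale like vi_VN would need a 
table lookup that is not shown here).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: does this locale name advertise UTF-8?
   Only a substring match; suffix-less UTF-8 locales are not detected. */
int names_utf8(const char *name)
{
    if (!name)
        return 0;
    return strstr(name, "UTF-8") != NULL || strstr(name, "utf-8") != NULL
        || strstr(name, "UTF8") != NULL || strstr(name, "utf8") != NULL;
}

/* Decode one UTF-8 sequence of at most n bytes starting at p.
   Returns the number of bytes consumed, or -1 if the sequence is
   truncated, malformed, overlong, a surrogate, or above U+10FFFF. */
int utf8_decode(unsigned long *out, const unsigned char *p, size_t n)
{
    unsigned long cp, min;
    int len, i;

    if (n == 0)
        return -1;
    if (p[0] < 0x80) {
        *out = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; cp = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; cp = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; cp = p[0] & 0x07; min = 0x10000;
    } else {
        return -1;                      /* stray continuation or invalid lead byte */
    }
    if ((size_t)len > n)
        return -1;                      /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return -1;                  /* not a continuation byte */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;
    *out = cp;
    return len;
}

/* Hypothetical stand-in for mbtowc(): decode UTF-8 ourselves when the
   locale variables ask for it, otherwise delegate to the system function. */
int shim_mbtowc(wchar_t *pwc, const char *s, size_t n)
{
    unsigned long cp;
    int len;
    const char *v = getenv("LC_ALL");

    if (!v || !*v) v = getenv("LC_CTYPE");
    if (!v || !*v) v = getenv("LANG");

    if (!names_utf8(v))
        return mbtowc(pwc, s, n);       /* system behaviour */
    if (!s)
        return 0;                       /* UTF-8 has no shift state */
    len = utf8_decode(&cp, (const unsigned char *)s, n);
    if (len < 0)
        return -1;
    if (pwc)
        *pwc = (wchar_t)cp;             /* truncates if wchar_t is too narrow for cp */
    return cp == 0 ? 0 : len;
}
```

The environment lookup is kept out of the decoder on purpose, so the 
decoding part can be exercised without touching the process environment.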

> At least it does so for display, 
> although it suppresses 8-bit input for some obscure reason still to be 
> found.

I'm just guessing, but this could be related to the configure settings 
in my build script, if that's what you were using:

   --enable-shared --enable-utmp --enable-wtmp --enable-lastlog \
   --enable-xft --enable-font-styles --disable-xim --enable-combining \
   --enable-fallback=Rxvt --with-res-name=urxvt --with-res-class=URxvt \
   --program-suffix=-X \
   --enable-xpm-background  --enable-menubar --enable-rxvt-scroll \
   --enable-next-scroll --enable-xterm-scroll --enable-plain-scroll \
   --enable-transparency --enable-tinting --enable-fading \
   --enable-frills --enable-smart-resize --enable-pointer-blank \
   --enable-mousewheel --enable-slipwheeling --enable-keepscrolling \
   --enable-old-selection --disable-perl \
   --with-xpm-includes=/usr/X11R6/include \
   --with-xpm-library=/usr/X11R6/lib \
   --x-libraries=/usr/X11R6/lib


Note two things: --disable-xim is set, and --enable-8bitctrls is not.

Now, the latter is "not recommended" and its only effect is the 
following block of code in the input-processing loop:

#ifdef EIGHT_BIT_CONTROLS
       // 8-bit controls
       case 0x90:        /* DCS */
         process_dcs_seq ();
         break;
       case 0x9b:        /* CSI */
         process_csi_seq ();
         break;
       case 0x9d:        /* OSC */
         process_osc_seq ();
         break;
#endif

So, I don't think that's it.

=====

While 8bit input != xim, there are two things I've discovered about the 
rxvt-unicode source code:
   (1) very little testing is done in non-default configurations (and 
--enable-xim is the default)
   (2) some #define macros turn on/turn off more than their simple names 
and descriptions might suggest -- and the code often makes unwarranted 
assumptions (e.g. see earlier thread about an unwarranted linkage 
between transparency and XPM support)

So, it's possible that --disable-xim turns off some non-XIM input 
support needed for 8bit entry.

Try: --enable-xim.
=====

Also, try the iso14755 support (CTRL-SHIFT-key).  Maybe that helps?

=====

Finally, input is a cooperative affair between the terminal, the shell, 
and for X11 terminals, the Xserver.  In the case of bash, that also 
includes readline.  How's your ~/.inputrc set up?

      # don't strip characters to 7 bits when reading
      set input-meta on

      # allow iso-latin1 characters to be inserted rather
      # than converted to prefix-meta sequences
      set convert-meta off

      # display characters with the eighth bit set directly
      # rather than as meta-prefixed characters
      set output-meta on

Also, are you sure that the "meta" key is what you think it is?  You can 
force it by using the -mod cmdline option of rxvt-unicode (see the 
urxvt manpage).  I think the cygwin Xserver defaults to using Alt.

And then, there's the -meta8 cmdline option to rxvt-unicode:

      meta8: boolean
           True: handle Meta (Alt) + keypress to set the 8th bit.
           False: handle Meta (Alt) + keypress as an escape prefix
           [False is default].

Maybe you want True?
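
To make the difference concrete, here is a small sketch of the two 
behaviours described above. This is not lifted from the rxvt-unicode 
source; encode_meta_key is a made-up name, and it only models what the 
terminal would send to the application for a single-byte key.

```c
#include <stddef.h>

/* Hypothetical model of the meta8 setting: write the bytes a terminal
   would send for Meta+key into out[] and return how many were written. */
size_t encode_meta_key(unsigned char key, int meta8, unsigned char out[2])
{
    if (meta8) {
        /* meta8 True: set the 8th bit on the key itself */
        out[0] = (unsigned char)(key | 0x80);
        return 1;
    }
    /* meta8 False (the default): send ESC as a prefix, then the key */
    out[0] = 0x1B;
    out[1] = key;
    return 2;
}
```

So Alt+a comes out as the single byte 0xE1 with meta8 True (which is 
"a with acute" in ISO-8859-1), and as ESC followed by 'a' with the 
default.  With "convert-meta on" in ~/.inputrc, readline would turn the 
former back into the latter, which is why the two settings interact.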

> I will send the files to you (Charles Wilson) directly and would 
> appreciate if you confirm the solution.

Quick perusal looks pretty good.  I like the caching of is_u_utf8_mode, 
but you should watch out: --enable-frills turns on
    'locale switching escape sequence'
so you might need to add a hook in that handler to "un-cache".
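
Something along these lines is what I have in mind.  It is only a 
sketch: the names (is_utf8_mode, on_locale_switch, locale_is_utf8, 
probe_count) are invented here and are not the ones in your patch or in 
rxvt-unicode, and I'm assuming the escape-sequence handler gets handed 
the new locale name.

```c
#include <stddef.h>
#include <string.h>

static int utf8_cache = -1;                 /* -1 = not probed yet, else 0 or 1 */
static const char *current_locale = "C";    /* hypothetical: active locale name */
static int probe_count = 0;                 /* counts real probes, for illustration */

/* The (comparatively) expensive check that the cache is meant to avoid. */
int locale_is_utf8(const char *name)
{
    probe_count++;
    return name != NULL
        && (strstr(name, "UTF-8") != NULL || strstr(name, "utf8") != NULL);
}

/* Cached query: probes once, then answers from the cache. */
int is_utf8_mode(void)
{
    if (utf8_cache < 0)
        utf8_cache = locale_is_utf8(current_locale);
    return utf8_cache;
}

/* Hook to call from the locale-switching escape sequence handler.
   The caller must keep new_locale alive; only the pointer is stored. */
void on_locale_switch(const char *new_locale)
{
    current_locale = new_locale;
    utf8_cache = -1;                        /* "un-cache": force a fresh probe */
}
```

Without the reset in on_locale_switch, is_utf8_mode would keep returning 
whatever it saw at startup even after the application switched locales.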

--
Chuck


