[Fwd: [1.7] wcwidth failing configure tests]

Tue May 12 17:32:00 GMT 2009

On May 12 17:56, Andy Koppe wrote:
> > And here's another question.  The utf8*.h files claim they have been
> > generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
> > have the script which generated the utf8*.h files?  Can we regenerate
> > the files to match the current Unicode 5.1 standard?
> 
> There's Markus Kuhn's wcwidth implementation, which says it's based on
> Unicode 5.0:
> 
> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

This looks nice.

> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> category of characters, which consists of things like Greek and
> Cyrillic letters as well as line drawing symbols. Those have a width
> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.

We should use the standard variant alone, imho.
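
To make the difference between the two variants concrete, here is a
minimal sketch, not Markus Kuhn's actual code.  The range table is a
small, incomplete excerpt of the ambiguous-width ranges, and the names
is_ambiguous() and ambiguous_aware_width() are hypothetical, made up for
this illustration.  The caller is assumed to already have the "standard"
width of the character.

```c
#include <stddef.h>
#include <stdint.h>

struct range { uint32_t first, last; };

/* Incomplete excerpt of "CJK Ambiguous Width" ranges, for illustration only. */
static const struct range ambiguous[] = {
    { 0x0391, 0x03A1 },   /* Greek capital Alpha..Rho */
    { 0x03A3, 0x03A9 },   /* Greek capital Sigma..Omega */
    { 0x03B1, 0x03C1 },   /* Greek small alpha..rho */
    { 0x03C3, 0x03C9 },   /* Greek small sigma..omega */
    { 0x0410, 0x044F },   /* Cyrillic A..ya */
    { 0x2500, 0x254B },   /* box drawing */
};

/* Hypothetical helper: linear scan is fine for a table this small. */
static int is_ambiguous(uint32_t cp)
{
    for (size_t i = 0; i < sizeof ambiguous / sizeof ambiguous[0]; i++)
        if (cp >= ambiguous[i].first && cp <= ambiguous[i].last)
            return 1;
    return 0;
}

/* Hypothetical wrapper: base_width is the standard (Western) width.
   In CJK mode an ambiguous character of width 1 is widened to 2;
   everything else keeps its standard width. */
static int ambiguous_aware_width(uint32_t cp, int base_width, int cjk_mode)
{
    if (cjk_mode && base_width == 1 && is_ambiguous(cp))
        return 2;
    return base_width;
}
```

With cjk_mode set to 0 this degenerates to the standard width, which is
the behaviour suggested above.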

And we need some workaround for UTF-16 systems like Cygwin.
Unfortunately, surrogate pairs only work well as part of a string, not
as standalone chars.  So wcwidth would return -1 for each surrogate half
on its own, but wcswidth could be tweaked to handle the pairs gracefully.
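
A minimal sketch of what such a tweak could look like, under a few
assumptions: uint16_t stands in for a 16-bit wchar_t, cp_width() is a
hypothetical and deliberately tiny stand-in for a real width table, and
utf16_wcswidth() is a made-up name, not an existing function.  A high
surrogate followed by a low surrogate is combined into one code point
before the lookup; an unpaired surrogate still yields -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a real per-code-point width table. */
static int cp_width(uint32_t cp)
{
    if (cp == 0)
        return 0;
    if (cp < 0x20 || (cp >= 0x7f && cp < 0xa0))
        return -1;                              /* control characters */
    if (cp >= 0xd800 && cp <= 0xdfff)
        return -1;                              /* lone surrogate half */
    if ((cp >= 0x4e00 && cp <= 0x9fff) ||       /* CJK Unified Ideographs */
        (cp >= 0x20000 && cp <= 0x2fffd))       /* supplementary ideographs */
        return 2;
    return 1;                                   /* everything else: assume 1 */
}

/* Width of a UTF-16 string of at most n code units.
   Returns -1 if any character is non-printable or a surrogate is unpaired. */
static int utf16_wcswidth(const uint16_t *s, size_t n)
{
    int total = 0;

    assert(s != NULL);
    for (size_t i = 0; i < n && s[i] != 0; i++) {
        uint32_t cp = s[i];

        /* Combine a high surrogate with the low surrogate that follows it. */
        if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < n
            && s[i + 1] >= 0xdc00 && s[i + 1] <= 0xdfff) {
            cp = 0x10000 + ((cp - 0xd800) << 10) + (s[i + 1] - 0xdc00);
            i++;                                /* skip the low half */
        }

        int w = cp_width(cp);
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}
```

For example, the units 0x0041 0xD840 0xDC00 ("A" followed by U+20000)
are counted as two characters rather than three, while 0xD840 on its own
is rejected.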

Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
