This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: codeset problems in wprintf and wcsftime


On 2 March 2010 20:47, Corinna Vinschen:
>> - Add another flag __HAVE_WCHAR_LOCALE_INFO__ or something. ÂUsed in
>> Â vfwprintf and wcsftime it would use different code for targets
>> Â which only have the multibyte locale info and targets which have
>> Â also the wide char representation.
>
> Attached is a patch which does that and more. ÂI do not intend to apply
> this patch before the next Cygwin release 1.7.2 is out (which will be
> RSN), but I wanted to show what I have, so you guys can make a sanity
> check on the new code if you're interested in this stuff.
>
> Here's what the patch does, in random order:
>
> - Add a new macro called __HAVE_LOCALE_INFO_EXTENDED__ which guards
> Â(almost) all of the new locale definitions.
>
> - Add two files locale/lctype.[ch], which define a basic LC_CTYPE
> Âdatastructure and functions as in the other categories.
>
> ÂThis adds two features. ÂOne of them is for all targets.
>
> Â- The global variables lc_ctype_charset, __mb_cur_max, and
> Â Âlc_message_charset are no longer used on targets defining
> Â Â__HAVE_LOCALE_INFO__. ÂThe reason is that global variables disallow
> Â Âto define thread-local locales. ÂThese are a GNU extension which
> Â Âhave been added to the latest POSIX-1.2008 standard, see
> Â Âuselocale(3).
>
> Â ÂThe replacement are codeset and mb_cur_max members in the LC_CTYPE
> Â Âstructure, and codeset in the LC_MESSAGES structure.
>
> Â ÂThis affects especially the MB_CUR_MAX definition in stdlib.h. ÂIt's
> Â Ânot any longer just __mb_cur_max, rather it calls the new
> Â Â__locale_mb_cur_max() function.

I'm concerned about the level of indirection that this entails.

Leaving out the ifdef's, we've got:

int
_DEFUN_VOID(__locale_mb_cur_max)
{
  return __get_current_ctype_locale ()->mb_cur_max;
}

Actually, I think there's a bug here: __get_current_ctype_locale()
returns a pointer to lc_ctype_T struct, and the mb_cur_max field in
that is of type 'const char *'. Looking at the code in
__ctype_load_locale(), this is intentional, because apparently it's
meant to point into a char array pointed to by _ctype_locale_buf. So I
think there's a '*' missing in the function above.

But I'm puzzled about why that indirection into _ctype_locale_buf is
there in the first place. Why not just store mb_cur_max directly in
the lc_ctype_T struct?

The __ctype_locale_buf also holds the name of the selected charset,
and I don't like the charset and mb_cur_max being mixed up in that
way, and I think this is really ugly:

/* Max encoding_len + NUL byte + 1 byte mb_cur_max plus trailing NUL byte */
#define _CTYPE_BUF_SIZE	34
static char _ctype_locale_buf[_CTYPE_BUF_SIZE];

What's the idea behind that? And indeed, why not store the codeset
string directly in an array in lc_ctype_T?

Also, in __get_current_ctype, there's this:

struct lc_ctype_T *
__get_current_ctype_locale(void) {
	return (_ctype_using_locale
		? &_ctype_locale
		: (struct lc_ctype_T *)&_C_ctype_locale);
}

Checking whether we're in the "C" locale every single time seems
rather inefficient. Wouldn't it be better to initialise _ctype_locale
with the contents _C_ctype_locale and later copy it in again when the
C locale is selected?

Andy


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]