Invalid tm_zone from localtime() when TZ is not set

Hans-Bernhard Bröker HBBroeker@t-online.de
Thu May 26 11:03:00 GMT 2016


On 25.05.2016 at 10:44, Corinna Vinschen wrote:
> On May 25 11:28, KOBAYASHI Shinji wrote:
>>
>> Any other comments on this topic? Let me explain my proposal again.
>>
>> The intention of the following code in tzsetwall() should be to pick
>> up UPPERCASE letters "in ASCII range":

Are you sure you're not mixing ASCII with '8-bit character' range there?

>> if (isupper(*src)) *dst++ = *src;
>>
>> NOTE: src is wchar_t *, dst is char *.
>>
>> As Csaba Raduly pointed out, isw*() functions should be the first
>> choice if they achieve the desired behavior (select uppercase AND
>> ASCII).

But it doesn't, so it's not.

>> However, iswupper() does not fit for this purpose, as it
>> returns 1 for L'\uff21' for example. And I could not find isw*()
>
> In that case, wouldn't it make sense to fix iswupper in the first place?

I don't believe it's been shown to be broken, so there's no need to fix it.

> Apart from that, we can workaround all problems in tzsetwall by just
> checking for
>
>   if (*src >= L'A' && *src <= L'Z')

While that may work if ASCII really is what you're looking for, it
defeats the whole reason <ctype.h> and <wctype.h> exist: to make tests
like this as independent of the actual character encoding as possible.

Here's what I wrote last week, but apparently only to Csaba Raduly:

On 20.05.2016 at 09:09, Csaba Raduly wrote:

 > If the type of those members is WCHAR[] then using isascii() /
 > isupper() on them is just plain wrong.

Absolutely.  The argument type of isupper() and friends is 'int', not 
'unsigned char'.  But the _only_ allowed argument values are those in 
the range of unsigned char, plus EOF.  For typical systems, that means 
the allowed argument range of is*() is -1 ... 255 inclusive.  Calling 
these Standard Library functions with any other argument causes 
undefined behaviour.

That leaves three sensible ways of calling isupper() in portable code:

*) isupper(foo)  # where the type of foo is unsigned char
*) isupper((unsigned char)bar) # where bar is signed char, or plain char
*) isupper(baz) # where baz is the int returned by fgetc() or similar

All other call patterns are simply wrong, or at least non-portable.
In particular, passing a wchar_t to any of the <ctype.h> functions is
wrong every time.
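
A minimal sketch of the second pattern, for illustration only. The
helper name count_upper() is made up for this example and does not come
from the Cygwin sources:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the upper-case letters in a plain char
 * string.  The cast to unsigned char keeps the argument inside the
 * range isupper() accepts, even where plain char is signed and the
 * string contains bytes above 0x7f. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (isupper((unsigned char)*s))
            ++n;
    }
    return n;
}
```

Without the cast, a byte such as '\xe4' would reach isupper() as a
negative int on platforms where plain char is signed, which is exactly
the undefined behaviour described above.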

 > The correct function to use would be iswupper().

Actually, the choice between isupper() and iswupper() isn't even the
real problem here.  The whole idea of copying a wchar_t string into a
char one, element by element, is most likely nonsensical.  A wchar_t
cannot be assumed to just fit into a char, regardless of whether
iswupper() returned true for it or not.  E.g. what do we expect this to
do with an upper-case Greek or Cyrillic letter?

A proper solution may have to be more like this:

     int mapped = wctob(*src);
     /* This call is safe now: wctob() returns either EOF or a value
        in the range of unsigned char. */
     if (isupper(mapped)) {
        *dst++ = (unsigned char)mapped;
     }
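
Wrapped into a complete function, that idea might look roughly like the
sketch below.  The name copy_upper_narrow(), its parameters and the
truncation behaviour are assumptions made for this example; this is not
the actual tzsetwall() code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* wctob(), wint_t */

/* Hypothetical helper: copy those upper-case letters of the wide string
 * src that have a single-byte representation in the current locale into
 * dst.  At most dst_size - 1 characters are stored, and dst is always
 * NUL-terminated.  Returns the number of characters stored. */
size_t copy_upper_narrow(char *dst, size_t dst_size, const wchar_t *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && dst_size > 0);
    for (; *src != L'\0' && n + 1 < dst_size; ++src) {
        /* wctob() yields EOF when the wide character has no
         * single-byte form, otherwise an unsigned char value. */
        int mapped = wctob((wint_t)*src);

        if (mapped != EOF && isupper(mapped))
            dst[n++] = (char)mapped;
    }
    dst[n] = '\0';
    return n;
}
```

Called on a wide string such as L"W. Europe Standard Time", this stores
"WEST".  Wide characters without a single-byte form are skipped because
wctob() reports EOF for them, whatever iswupper() says about them.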

 >> So, I propose to call isascii() to assure the wchar_t fits in the
 >> range of ASCII before calling isupper().

Calling isascii() would be wrong for the same reasons calling isupper() is.



