Re: Invalid tm_zone from localtime() when TZ is not set

On Fri, May 20, 2016 at 6:22 AM, KOBAYASHI Shinji wrote:
> localtime() calls tzsetwall() when TZ is not set. In tzsetwall(),
> the StandardName and DaylightName member values retrieved by
> GetTimeZoneInformation() are checked with isupper() and copied to the
> char[] buffer used as the timezone name in tzparse(). However, the
> type of these member values is wchar_t and isupper() is defined only
> when isascii() is true.

If the type of those members is WCHAR[] then using isascii() /
isupper() on them is just plain wrong.
The correct function to use would be iswupper().

The line
    if (isupper(*src)) *dst++ = *src;

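(where src is wchar_t* and dst is char*) assumes that the upper 8 bits
of *src are zero (or *src is -1, i.e. EOF).
If not, the isupper() call is undefined behavior, since its argument
must be representable as unsigned char or equal EOF, and the narrowing
assignment to char is at best implementation-defined.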
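
To make that precondition explicit, here is a minimal sketch of a copy
loop that checks the range before it ever calls isupper(). The helper
name copy_ascii_upper() and its signature are made up for illustration;
this is not the actual tzsetwall() code, only one way the check could
look.

```c
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not the real tzsetwall() code): copy only the
   ASCII uppercase letters of a wide string into a char buffer.
   Returns the number of characters stored, excluding the NUL. */
static size_t
copy_ascii_upper (char *dst, size_t dstlen, const wchar_t *src)
{
  size_t n = 0;

  if (dstlen == 0)
    return 0;
  for (; *src != L'\0' && n + 1 < dstlen; ++src)
    {
      /* Range check first, so isupper() only ever sees 0..127.
         The unsigned cast also rejects negative values on platforms
         where wchar_t is a signed type. */
      if ((unsigned long) *src < 0x80 && isupper ((int) *src))
        dst[n++] = (char) *src;
    }
  dst[n] = '\0';
  return n;
}
```

With the check in place, a non-ASCII wide character is simply skipped
instead of being used as an index into the ctype table.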

> So it may happen that the char[] buffer
> contains invalid characters as a result of implicit cast from wchar_t
> to char.
> The return value of isupper() for non-ascii characters depends on
> other data, because an out of bounds access occurs for the small
> (128 + 256) table used in isupper(). I confirmed the above error on
> Japanese Windows with 64-bit Cygwin 2.5.0-1 and 2.5.1-1, but had no
> problem with 64-bit Cygwin 2.4.1-1 nor with 32-bit Cygwins.
> So, I propose to call isascii() to assure the wchar_t fits in the
> range of ASCII before calling isupper().
> I have considered some other methods:
> 1. Using iswupper() instead of isupper().
>    - Although this method is effective for Japanese environments, it
>      is not assured that a character for which iswupper() returns true
>      fits in the range of ASCII.

It is highly likely that if the argument of iswupper() does not fit
into ASCII then its result won't fit either.

> 2. Add (char) cast to the argument of isupper().
>    - This method assures that the copied characters are uppercase
>      only. However, it may be different from original characters due
>      to casting.
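
A small illustration of that drawback, as a sketch only. It uses
unsigned char rather than plain char so that the conversion is well
defined, and it assumes 8-bit bytes and an ASCII execution character
set. The function name upper_after_cast() is made up. U+3042 (HIRAGANA
LETTER A) has the low byte 0x42, so after the cast it is
indistinguishable from 'B'.

```c
#include <ctype.h>
#include <wchar.h>

/* Hypothetical illustration: narrow a wide character first, then ask
   isupper().  Returns the narrowed character if it tests as uppercase,
   otherwise 0. */
static int
upper_after_cast (wchar_t wc)
{
  unsigned char c = (unsigned char) wc;   /* keeps only the low 8 bits */

  return isupper (c) ? c : 0;
}
```

Here upper_after_cast ((wchar_t) 0x3042) yields 'B', a valid uppercase
letter that never appeared in the original name.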

