This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: The C locale


On Sep 24 11:57, Corinna Vinschen wrote:
> On Sep 24 18:37, IWAMURO Motonori wrote:
> > - CP932 (Shift_JIS) has 1-byte and 2-byte characters.
> > 
> > - The range of a 1-byte character is 0x00-0x7F and 0xA0-0xDF.
> > 
> > - The range of the first byte of a 2-byte character is 0x80-0x9F and
> >   0xE0-0xFC.
> > 
> > - The range of the second byte of a 2-byte character is 0x40-0x7E and
> >   0x80-0xFC.  This includes "[", "\", "]", "^", "`", "{", "|", "}".
> 
> Ok, thanks for your examples, they show neatly where the problem is.
> 
> As you might know, the codepage 20932 (EUC-JP) is also not the same
> as the UNIX EUC_JP implementation.  The JIS-X-0212 three byte codes
> are folded into two-byte sequences as described in a comment in
> strfuncs.cc:
> 
>   /* Unfortunately, the Windows eucJP codepage 20932 is not really 100%
>      compatible to eucJP.  It's a cute approximation which makes it a
>      doublebyte codepage.
>      The JIS-X-0212 three byte codes (0x8f,0xa1-0xfe,0xa1-0xfe) are folded
>      into two byte codes as follows: The 0x8f is stripped, the next byte is
>      taken as is, the third byte is mapped into the lower 7-bit area by
>      masking it with 0x7f.  So, for instance, the eucJP code 0x8f,0xdd,0xf8
>      becomes 0xdd,0x78 in CP 20932.
> 
>      To be really eucJP compatible, we have to map the JIS-X-0212 characters
>      between CP 20932 and eucJP ourselves. */
> 
> My question is this:  Is the S-JIS implementation on UNIX systems
> also using a different implementation to avoid using characters
> from the ASCII range?  If so, can't we change the __sjis_wctomb
> and __sjis_mbtowc functions in the same manner as the __eucjp_wctomb
> and __eucjp_mbtowc functions to get a safer implementation?
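The folding described in the strfuncs.cc comment quoted above can be sketched roughly as follows.  This is only an illustration, not the actual Cygwin or newlib code: the function names eucjp_to_cp20932 and cp20932_to_eucjp are made up for this example, and the inverse assumes that a CP 20932 pair whose second byte is below 0x80 is always a folded JIS-X-0212 character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the Cygwin implementation): fold a three-byte
   eucJP JIS-X-0212 sequence (0x8f,0xa1-0xfe,0xa1-0xfe) into the two-byte
   form used by CP 20932.  Returns 2 on success, 0 if the input is not a
   JIS-X-0212 sequence. */
static size_t
eucjp_to_cp20932 (const unsigned char in[3], unsigned char out[2])
{
  assert (in != NULL && out != NULL);
  if (in[0] != 0x8f
      || in[1] < 0xa1 || in[1] > 0xfe
      || in[2] < 0xa1 || in[2] > 0xfe)
    return 0;
  out[0] = in[1];         /* the 0x8f is stripped, next byte taken as is */
  out[1] = in[2] & 0x7f;  /* third byte masked into the lower 7-bit area */
  return 2;
}

/* Hypothetical inverse.  Assumption: a second byte in 0x21-0x7e marks a
   folded JIS-X-0212 character.  Returns 3 on success, 0 otherwise. */
static size_t
cp20932_to_eucjp (const unsigned char in[2], unsigned char out[3])
{
  assert (in != NULL && out != NULL);
  if (in[0] < 0xa1 || in[0] > 0xfe || in[1] < 0x21 || in[1] > 0x7e)
    return 0;
  out[0] = 0x8f;          /* restore the JIS-X-0212 prefix byte */
  out[1] = in[0];
  out[2] = in[1] | 0x80;  /* move the byte back into the upper area */
  return 3;
}
```

With the example from the comment, eucjp_to_cp20932 turns 0x8f,0xdd,0xf8 into 0xdd,0x78, and cp20932_to_eucjp maps that pair back again.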

Hmm, as far as I can see from Wikipedia, S-JIS is simply defined
that way.  Bah.
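
To illustrate why this matters: since the second byte of a 2-byte character may fall into the ASCII range, code that looks for, say, a backslash has to scan from the start of the string and skip trail bytes.  A minimal sketch, assuming the byte ranges exactly as listed in the quoted message; the function names are hypothetical and this is not how __sjis_mbtowc is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lead byte ranges as given in the quoted message. */
static bool
sjis_is_lead (unsigned char c)
{
  return (c >= 0x80 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc);
}

/* Trail byte ranges as given in the quoted message.  Note that
   0x40-0x7e overlaps ASCII, including '\\' (0x5c). */
static bool
sjis_is_trail (unsigned char c)
{
  return (c >= 0x40 && c <= 0x7e) || (c >= 0x80 && c <= 0xfc);
}

/* Hypothetical helper: return true only if s[pos] is a standalone
   1-byte character equal to ch.  The scan starts at the beginning of
   the buffer because a byte cannot be classified in isolation. */
static bool
sjis_is_real_char (const unsigned char *s, size_t len, size_t pos,
                   unsigned char ch)
{
  size_t i = 0;

  assert (s != NULL && pos < len);
  while (i < len)
    {
      bool dbl = sjis_is_lead (s[i]) && i + 1 < len
                 && sjis_is_trail (s[i + 1]);

      if (i == pos)
        return !dbl && s[i] == ch;
      if (dbl)
        {
          if (i + 1 == pos)
            return false;   /* pos is the trail byte of a 2-byte char */
          i += 2;
        }
      else
        i++;
    }
  return false;
}
```

For the byte sequence 0x95,0x5c,0x5c (a 2-byte character ending in 0x5c, followed by a real backslash), sjis_is_real_char rejects position 1 and accepts position 2.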


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

