This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



bug in EUC-JP converter and charmap



The iconv converter for EUC-JP converts 0xA1C0 to U+FF3C and 0x8FA2B7 to
U+FF5E, and the EUC-JP charmap table lacks both mappings.

But the two converter mappings are wrong:
Unicode.org's Mappings/EASTASIA/JIS/JIS0208.TXT maps the first one to U+005C.
Unicode.org's Mappings/EASTASIA/JIS/JIS0212.TXT maps the second one to U+007E.

Here is a patch to:
  - Change the iconv converter accordingly.
    (Yes, the roundtrip EUC-JP -> Unicode -> EUC-JP will change
     0xA1C0 to 0x5C and 0x8FA2B7 to 0x7E, but this is not a problem because
     0x5C and 0x7E are unambiguously the "REVERSE SOLIDUS" and "TILDE" in
     EUC-JP.)
  - Add commented lines to the EUC-JP charmap which the testsuite will
    recognize.
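
To illustrate the roundtrip remark above, here is a minimal sketch in C. It is
not glibc code: the struct, the table and all sketch_* names are hypothetical,
and the table covers only the four code points discussed in this message. It
assumes that an entry flagged as irreversible is used for EUC-JP -> Unicode
but skipped for Unicode -> EUC-JP, which is how I read the %IRREVERSIBLE%
charmap lines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no mapping in this sketch".  */
#define SKETCH_NONE 0xFFFFFFFFu

/* Hypothetical miniature of the patched mapping.  The real converter
   uses the large tables in iconvdata/; this only models the idea.  */
struct sketch_mapping
{
  uint32_t eucjp;       /* EUC-JP byte sequence packed into an integer.  */
  uint32_t ucs;         /* Unicode code point.  */
  int irreversible;     /* Nonzero: usable only for EUC-JP -> Unicode.  */
};

static const struct sketch_mapping sketch_map[] =
{
  { 0x5C,     0x005C, 0 },   /* REVERSE SOLIDUS, one byte.  */
  { 0x7E,     0x007E, 0 },   /* TILDE, one byte.  */
  { 0xA1C0,   0x005C, 1 },   /* JIS X 0208 code, irreversible.  */
  { 0x8FA2B7, 0x007E, 1 },   /* JIS X 0212 code, irreversible.  */
};

#define SKETCH_MAP_LEN (sizeof sketch_map / sizeof sketch_map[0])

/* EUC-JP -> Unicode: every entry may be used.  */
uint32_t
sketch_to_ucs (uint32_t eucjp)
{
  for (size_t i = 0; i < SKETCH_MAP_LEN; ++i)
    if (sketch_map[i].eucjp == eucjp)
      return sketch_map[i].ucs;
  return SKETCH_NONE;
}

/* Unicode -> EUC-JP: entries flagged irreversible are skipped.  */
uint32_t
sketch_from_ucs (uint32_t ucs)
{
  for (size_t i = 0; i < SKETCH_MAP_LEN; ++i)
    if (!sketch_map[i].irreversible && sketch_map[i].ucs == ucs)
      return sketch_map[i].eucjp;
  return SKETCH_NONE;
}

/* EUC-JP -> Unicode -> EUC-JP.  */
uint32_t
sketch_roundtrip (uint32_t eucjp)
{
  return sketch_from_ucs (sketch_to_ucs (eucjp));
}
```

Under these assumptions sketch_roundtrip (0xA1C0) yields 0x5C and
sketch_roundtrip (0x8FA2B7) yields 0x7E, while 0x5C and 0x7E map back to
themselves.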


2000-09-03  Bruno Haible  <haible@clisp.cons.org>

	* iconvdata/jis0208.c (__jis0208_to_ucs): Map EUC-JP 0xA1C0 to U+005C.
	* iconvdata/jis0212.c (__jisx0212_to_ucs): Map EUC-JP 0x8FA2B7 to
	U+007E.

2000-09-03  Bruno Haible  <haible@clisp.cons.org>

	* charmaps/EUC-JP: Nonreversibly map 0xA1C0 to U+005C and 0x8FA2B7 to
	U+007E.
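
For readers wondering why the jis0208.c hunk below touches the entry
[0x001f]: assuming the table is a flat array of 94 columns per row, indexed
from the first code 0xA1A1, the two bytes of 0xA1C0 give exactly that index.
The helper below is a hypothetical sketch of that arithmetic, not a function
from glibc. (The jis0212.c table is laid out differently and is not covered
here.)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of a two-byte EUC-JP (JIS X 0208) code in a
   flat table with 94 cells per row, starting at 0xA1A1.  */
unsigned int
sketch_jisx0208_index (uint8_t b1, uint8_t b2)
{
  /* Both bytes of such a code lie in the range 0xA1..0xFE.  */
  assert (b1 >= 0xA1 && b1 <= 0xFE);
  assert (b2 >= 0xA1 && b2 <= 0xFE);

  return (unsigned int) (b1 - 0xA1) * 94 + (unsigned int) (b2 - 0xA1);
}
```

With this layout sketch_jisx0208_index (0xA1, 0xC0) is 0x1f, and the
neighbours 0xA1BF (FULLWIDTH SOLIDUS) and 0xA1C1 (WAVE DASH) land on 0x1e and
0x20, which agrees with the values 0xff0f and 0x301c at those positions in the
diff.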

*** glibc-20000831/iconvdata/jis0208.c.bak	Tue Sep  7 16:50:40 1999
--- glibc-20000831/iconvdata/jis0208.c	Sun Sep  3 11:57:35 2000
***************
*** 67,73 ****
    [0x0010] = 0xffe3, [0x0011] = 0xff3f, [0x0012] = 0x30fd, [0x0013] = 0x30fe,
    [0x0014] = 0x309d, [0x0015] = 0x309e, [0x0016] = 0x3003, [0x0017] = 0x4edd,
    [0x0018] = 0x3005, [0x0019] = 0x3006, [0x001a] = 0x3007, [0x001b] = 0x30fc,
!   [0x001c] = 0x2015, [0x001d] = 0x2010, [0x001e] = 0xff0f, [0x001f] = 0xff3c,
    [0x0020] = 0x301c, [0x0021] = 0x2016, [0x0022] = 0xff5c, [0x0023] = 0x2026,
    [0x0024] = 0x2025, [0x0025] = 0x2018, [0x0026] = 0x2019, [0x0027] = 0x201c,
    [0x0028] = 0x201d, [0x0029] = 0xff08, [0x002a] = 0xff09, [0x002b] = 0x3014,
--- 67,73 ----
    [0x0010] = 0xffe3, [0x0011] = 0xff3f, [0x0012] = 0x30fd, [0x0013] = 0x30fe,
    [0x0014] = 0x309d, [0x0015] = 0x309e, [0x0016] = 0x3003, [0x0017] = 0x4edd,
    [0x0018] = 0x3005, [0x0019] = 0x3006, [0x001a] = 0x3007, [0x001b] = 0x30fc,
!   [0x001c] = 0x2015, [0x001d] = 0x2010, [0x001e] = 0xff0f, [0x001f] = 0x005c,
    [0x0020] = 0x301c, [0x0021] = 0x2016, [0x0022] = 0xff5c, [0x0023] = 0x2026,
    [0x0024] = 0x2025, [0x0025] = 0x2018, [0x0026] = 0x2019, [0x0027] = 0x201c,
    [0x0028] = 0x201d, [0x0029] = 0xff08, [0x002a] = 0xff09, [0x002b] = 0x3014,
*** glibc-20000831/iconvdata/jis0212.c.bak	Tue Sep  7 16:50:46 1999
--- glibc-20000831/iconvdata/jis0212.c	Sun Sep  3 11:58:09 2000
***************
*** 111,117 ****
  const uint16_t __jisx0212_to_ucs[] =
  {
    0x02d8, 0x02c7, 0x00b8, 0x02d9, 0x02dd, 0x00af, 0x02db, 0x02da,
!   0xff5e, 0x0384, 0x0385, 0x00a1, 0x00a6, 0x00bf, 0x00ba, 0x00aa,
    0x00a9, 0x00ae, 0x2122, 0x00a4, 0x2116, 0x0386, 0x0388, 0x0389,
    0x038a, 0x03aa, 000000, 0x038c, 000000, 0x038e, 0x03ab, 000000,
    0x038f, 000000, 000000, 000000, 000000, 0x03ac, 0x03ad, 0x03ae,
--- 111,117 ----
  const uint16_t __jisx0212_to_ucs[] =
  {
    0x02d8, 0x02c7, 0x00b8, 0x02d9, 0x02dd, 0x00af, 0x02db, 0x02da,
!   0x007e, 0x0384, 0x0385, 0x00a1, 0x00a6, 0x00bf, 0x00ba, 0x00aa,
    0x00a9, 0x00ae, 0x2122, 0x00a4, 0x2116, 0x0386, 0x0388, 0x0389,
    0x038a, 0x03aa, 000000, 0x038c, 000000, 0x038e, 0x03ab, 000000,
    0x038f, 000000, 000000, 000000, 000000, 0x03ac, 0x03ad, 0x03ae,
*** glibc-20000831/localedata/charmaps/EUC-JP.bak	Wed Jul 12 18:11:45 2000
--- glibc-20000831/localedata/charmaps/EUC-JP	Sun Sep  3 12:01:03 2000
***************
*** 276,281 ****
--- 276,282 ----
  <U2015>     /xa1/xbd     HORIZONTAL BAR
  <U2010>     /xa1/xbe     HYPHEN
  <UFF0F>     /xa1/xbf     FULLWIDTH SOLIDUS
+ %IRREVERSIBLE%<U005C>     /xa1/xc0     REVERSE SOLIDUS
  <U301C>     /xa1/xc1     WAVE DASH
  <U2016>     /xa1/xc2     DOUBLE VERTICAL LINE
  <UFF5C>     /xa1/xc3     FULLWIDTH VERTICAL LINE
***************
*** 7135,7140 ****
--- 7136,7142 ----
  <U00AF>     /x8f/xa2/xb4 MACRON
  <U02DB>     /x8f/xa2/xb5 OGONEK
  <U02DA>     /x8f/xa2/xb6 RING ABOVE
+ %IRREVERSIBLE%<U007E>     /x8f/xa2/xb7 TILDE
  <U0384>     /x8f/xa2/xb8 GREEK TONOS
  <U0385>     /x8f/xa2/xb9 GREEK DIALYTIKA TONOS
  <U00A1>     /x8f/xa2/xc2 INVERTED EXCLAMATION MARK
