Unable to open files including Korean names
Jaeho Shin
netj@sparcs.kaist.ac.kr
Tue Jun 15 14:35:00 GMT 2004
On Tue, 2004-06-15 09:14:22 -0400, Pierre A. Humblet wrote:
> Thanks. Nothing conclusive.
> Could you compile and run the following one line program?
>
> #include <windows.h>
> #include <stdio.h>
>
> main()
> {
> printf("AreFileApisANSI %d\n", AreFileApisANSI());
> }
>
> Compile it with
> gcc -mno-cygwin try_ansi.c
>
> With the -mno-cygwin, the value of CYGWIN=codepage:oem
> shouldn't matter. When compiled without that switch
> codepage:oem or codepage:ansi should matter.
>
> Running on 1.5.9 is OK.
Here's the result:
$ gcc -mno-cygwin try_ansi.c
$ ./a.exe
AreFileApisANSI 1
$
>
> Also, the Korean directory name has numerical value
> ~> od -x xx.txt
> 0000000 d1c7 dbb1
>
> Do you know what encoding that is? Is it Unicode or UTF8?
> If it is UTF8, do you know what the Unicode values should be?
Well, that's in EUC-KR and CP949. CP949 has some more characters
defined in the empty areas of EUC-KR. The directory name I used,
``한글'', which is pronounced ``hangeul'' and means Korean (written
language) in Korean, consists of two characters:
U+D55C: Hangul syllable Hieuh A Nieun,
U+AE00: Hangul syllable Kiyeok Eu Rieul.
(You may be able to find them in the Windows charmap.)
Neither character is in CP949's extension, so they have identical values
in both EUC-KR and CP949 encoding.
Yes, that is the same numerical value I get.
Running `echo -n 한글 | od -x -` tells me:
0000000 d1c7 dbb1
Now, `echo -n 한글 | iconv -f euc-kr -t utf-8 | od -x -` tells me:
0000000 95ed ea9c 80b8
So yes, it's in EUC-KR (or, equivalently in this case, CP949). I don't use
a Unicode environment yet. Actually, I don't know how to change the encoding
in Windows. The Korean version of Windows just uses CP949 by default.
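For comparing against code charts, a byte-by-byte dump is easier to read
than 16-bit words. A minimal sketch, assuming a POSIX od, tr and printf;
hex_bytes is a name made up for this example, and the bytes are written as
octal escapes so the result does not depend on the terminal's encoding:

```shell
# hex_bytes: dump stdin one byte at a time, in file order, as plain hex.
# (Hypothetical helper name; od -An -tx1 drops the offset column and
# prints single bytes instead of 16-bit words.)
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The EUC-KR / CP949 bytes of the directory name
printf '\307\321\261\333' | hex_bytes          # prints c7d1b1db

# The same two characters in UTF-8
printf '\355\225\234\352\270\200' | hex_bytes  # prints ed959ceab880
```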
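It looks like od -x prints 16-bit words in little-endian order, so the two
bytes of each word appear swapped. Converting to UCS-2 identifies the
characters as U+D55C and U+AE00;
`echo -n 한글 | iconv -f euc-kr -t ucs-2 | od -x -` tells me: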
0000000 5cd5 00ae
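The word values that od -x prints can be turned back into file byte order
by swapping the two hex-digit pairs in each word. A minimal sketch,
assuming a little-endian host and a sed that accepts -E; swap_words is a
name made up for this example, and the offset column is left out of the
input:

```shell
# swap_words: swap the two bytes inside every 4-hex-digit word on stdin.
# (Hypothetical helper name; only meaningful for od -x output produced
# on a little-endian machine.)
swap_words() {
    sed -E 's/([0-9a-f]{2})([0-9a-f]{2})/\2\1/g'
}

echo "5cd5 00ae" | swap_words   # prints d55c ae00, i.e. U+D55C U+AE00
echo "d1c7 dbb1" | swap_words   # prints c7d1 b1db, the EUC-KR bytes
```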
> Thanks for your help
My pleasure. :)
BTW, is there any reason you are not sending your messages to the cygwin
mailing list? If not, I'll just keep Cc'ing it.
--
신재호 | Jaeho Shin <netj@sparcs.kaist.ac.kr> | http://netj.org/
System Programmers' Association for Researching Computer Systems
Division of Computer Science, Department of EECS, KAIST