This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line


IWAMURO Motonori wrote:
The encoding of the C locale is ASCII, not ISO-8859-1.
I don't think ASCII is the same as ISO-8859-1.
Does it work with LANG=en_US.ISO-8859-1?

No, it doesn't. Mind you, though, I haven't managed to get piconv to recognize any of my LANG settings other than C in Cygwin 1.7.


$ export LANG=LANG=en_US.ISO-8859-1

$ piconv
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LC_ALL = (unset),
        LANG = "LANG=en_US.ISO-8859-1"
    are supported and installed on your system.

(... usage omitted...)

$ ./bug arg1 "before `cat copyright.txt` after" arg3
0: E:\cygwin1.7\tmp\bug.exe
1: arg1
2: before

Regards,
-Edward

2009/5/29 Edward Lam <edward@sidefx.com>:
Alexey Borzenkov wrote:
On Thu, May 28, 2009 at 7:28 PM, Edward Lam <edward@sidefx.com> wrote:
PS. In case you haven't noticed, copyright.txt is not a long file. It
consists of a single byte, 0xA9.
Did you try encoding copyright.txt as UTF-8? Perhaps your locale is UTF-8
and the encoder fails.
How is one supposed to determine one's locale in cygwin? I do NOT have LANG,
or any of the LC environment variables set. I even tried explicitly setting
LANG=C and it still fails.
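
For reference, here is a minimal sketch of the precedence that POSIX defines for choosing the character-encoding locale: LC_ALL first, then LC_CTYPE, then LANG. This message does not establish whether this Cygwin build consults these variables when converting arguments, so treat it as the general POSIX rule only. The function name `effective_ctype_locale` and the "C" fallback are assumptions made for illustration.

```python
def effective_ctype_locale(env):
    """Return the locale name that POSIX precedence selects for LC_CTYPE.

    Order: LC_ALL, then LC_CTYPE, then LANG. Empty values count as unset.
    The "C" fallback is an assumption; POSIX leaves the default to the
    implementation.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# Nothing set: the assumed default applies.
print(effective_ctype_locale({}))
# LC_ALL overrides LANG.
print(effective_ctype_locale({"LANG": "C", "LC_ALL": "en_US.ISO-8859-1"}))
```

Passing `os.environ` instead of a literal dictionary reports the value for the current shell. The function returns the string verbatim and does not check that the locale actually exists on the system.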

The problem does seem to stem from the new UTF-8 support in Cygwin 1.7.
However, I think something unexpected is going on here, because trying
something similar on Linux causes no problems. To confirm that it is a
UTF-8-related problem, let me repeat the steps in a slightly different way.
I assume that bug.exe, which simply prints out its arguments, has already
been compiled.

$ export LANG=C

$ ./bug arg1 "before `cat copyright.txt` after" arg3
0: E:\cygwin1.7\tmp\bug.exe
1: arg1
2: before

*Notice that argc is 3 when it should be 4!*

$ piconv -f iso-8859-1 -t utf8 < copyright.txt > fubar.txt

$ ./bug arg1 "before `cat fubar.txt` after" arg3
0: E:\cygwin1.7\tmp\bug.exe
1: arg1
2: before © after
3: arg3

*So now everything works because I converted the character into UTF-8.*
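
The difference between the two files can be checked at the byte level. The sketch below is an illustration only: it assumes copyright.txt holds the single byte 0xA9 as described, and it uses two hypothetical helpers, `is_valid_utf8` and `latin1_to_utf8`, where the second mirrors the piconv step.

```python
# Contents of copyright.txt as described in the thread: the single byte 0xA9.
raw = b"\xa9"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8, like the piconv command."""
    return data.decode("iso-8859-1").encode("utf-8")


converted = latin1_to_utf8(raw)
print(is_valid_utf8(raw))        # False: a lone 0xA9 is not valid UTF-8
print(converted.hex())           # c2a9: the two-byte UTF-8 form of U+00A9
print(is_valid_utf8(converted))  # True
```

So the original file is valid ISO-8859-1 but not valid UTF-8, while fubar.txt would contain the two bytes 0xC2 0xA9.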

I think this points to some form of invalid source encoding being applied to
the command-line arguments when spawning NATIVE applications.

Here's what happens when I try to compile bug.c using cygwin's gcc:

$ gcc bug.c -o bug-gcc.exe

$ ./bug-gcc arg1 "before `cat copyright.txt` after" arg3
0: ./bug-gcc
1: arg1
2: before © after
3: arg3

So there seems to be some sort of special marshaling of the command-line
arguments that works only when spawning Cygwin apps and breaks when
spawning native apps.
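
One way to picture this hypothesis is the toy model below. It is not Cygwin's code. It assumes the command line is decoded as UTF-8 before being handed to a native program and is cut off at the first invalid byte, which would be consistent with argument 2 ending after "before" and arg3 never arriving. The helper `decode_until_invalid` is hypothetical.

```python
def decode_until_invalid(cmdline: bytes) -> str:
    """Decode cmdline as UTF-8, dropping everything from the first invalid byte on."""
    try:
        return cmdline.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that is not valid UTF-8.
        return cmdline[:err.start].decode("utf-8")


# The argument taken from copyright.txt (ISO-8859-1 byte 0xA9).
latin1_cmdline = b'bug arg1 "before \xa9 after" arg3'
# The argument taken from fubar.txt (UTF-8 bytes 0xC2 0xA9).
utf8_cmdline = b'bug arg1 "before \xc2\xa9 after" arg3'

# The first command line is truncated right after "before ".
print(repr(decode_until_invalid(latin1_cmdline)))
# The second survives intact.
print(decode_until_invalid(utf8_cmdline) == utf8_cmdline.decode("utf-8"))
```

Under this assumption the UTF-8 form passes through unchanged, while the ISO-8859-1 form loses the rest of the command line.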

Regards,
-Edward

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/






