bug/deficiency in unzip: incompatible with other programs when entry path names have non-ascii chars

Brent yhbrent@yahoo.com
Wed Nov 5 02:13:00 GMT 2014

(Note subject edit to be more accurate)

>On 2014-11-04 12:17, Yaakov wrote:

>>On 2014-11-03 21:14, Brent wrote:
>>Any thoughts on the bug that I found with cygwin unzip regarding its unicode handling?
>>In particular, cygwin unzip seems to work with cygwin zip, but cannot extract archives produced by multiple other mainstream zip programs.
>>My last email detailing this is
>>    https://cygwin.com/ml/cygwin/2014-11/msg00023.html
>Have you tried this again with 6.0-11 too? Unless I'm doing something wrong, I can't reproduce your error with it.

For sure: after I updated cygwin the other day to test the large file fix, I picked up unzip version 6.0-11.  So it is what I am now using via cygwin.

I then reran my complete test suite.  Everything now works except the part of the test where cygwin unzip extracts a zip file produced by Java.  This particular zip file has entries whose path names contain non-ASCII chars.  I have manually verified that this zip file is perfectly extractable by 7zip and WinZip, so Java does not seem to be the problem.
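
One thing that can be checked on such an archive without any unzip program is how the entry names are declared to be encoded.  The ZIP format specification (PKWARE APPNOTE) defines general purpose bit 11 as the language encoding flag: when it is set, the entry's file name is UTF-8.  The sketch below reads that bit from the first local file header.  The function name zip_utf8_flag is made up for illustration, and it assumes the archive starts with a local file header at offset 0 (no self-extractor stub in front).  Whether cygwin unzip's behavior depends on this flag is not established anywhere in this thread.

```shell
# Hypothetical helper: report whether the first entry of a zip archive
# has general purpose bit 11 (UTF-8 file names) set in its local file header.
# Prints "set", "clear", or "unknown" (when no local header is at offset 0).
zip_utf8_flag() {
    file=$1
    # The first four bytes of a local file header are "PK\003\004".
    sig=$(od -An -tx1 -N4 "$file" 2>/dev/null | tr -d ' \n')
    if [ "$sig" != "504b0304" ]; then
        echo "unknown"
        return 1
    fi
    # The flag field is 2 bytes, little-endian, at offset 6, so
    # bit 11 (0x0800) is bit 3 (0x08) of the byte at offset 7.
    hi=$(od -An -tu1 -j7 -N1 "$file" | tr -d ' \n')
    if [ $(( ${hi:-0} & 8 )) -ne 0 ]; then
        echo "set"
    else
        echo "clear"
    fi
}
```

Running zip_utf8_flag test.zip on the Java-produced archive and on the same content zipped by cygwin zip would show whether the two declare their names differently.  It only looks at the first entry, so an archive with mixed entries would need a fuller parser.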

I would gladly attach the zip file to this email, but this mailing list does not seem to like attachments.

So, I am trying a free file upload service.  My archive, test.zip, should be downloadable from here:

Read the File description in the URL above too.

I am looking forward to what you find.

>On 2014-11-05 03:51, Andrey wrote:
>Can this be related to locale settings?
>I didn't see Brent mentioning his locale settings, though.

I have never done any configuration after installing cygwin.  In particular, I have never mucked with any locale settings.  So whatever the default install gives is what I have.  (Unless cygwin draws on what my Windows locale settings are?)

I had to look up what locale settings cygwin even offers.  This seems to be a good link:

One claim in that link is that cygwin only cares about these 3 env vars: LC_ALL, LC_CTYPE, and LANG.  Here is what they are on my system:
    $ echo "LC_ALL = $LC_ALL"
    LC_ALL =

    $ echo "LC_CTYPE = $LC_CTYPE"
    LC_CTYPE =

    $ echo "LANG = $LANG"
    LANG = en_US.UTF-8
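
Since three variables are involved, it may help to spell out which one wins.  Under the usual POSIX precedence, a non-empty LC_ALL overrides LC_CTYPE, which in turn overrides LANG.  The sketch below resolves the effective character-type locale from those three variables.  The function name effective_ctype is made up for illustration, and the final fallback of "C" is an assumption: what a given cygwin release uses when all three are unset may differ.

```shell
# Hypothetical helper: print the locale that governs character handling,
# using the precedence LC_ALL > LC_CTYPE > LANG.
# Empty values are treated the same as unset ones.
# The "C" fallback is an assumption, not a documented cygwin default.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}
```

With the values shown above (LC_ALL and LC_CTYPE empty, LANG set), effective_ctype prints en_US.UTF-8, so the character set in effect should be UTF-8.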

Andrey, is this what you are looking for, or do you need something else?
