bug/deficiency in unzip: incompatible with other programs when entry path names have non-ascii chars

Brent yhbrent@yahoo.com
Wed Nov 5 03:55:00 GMT 2014

>On 2014-11-04 18:08, Brent wrote:

>I then reran my complete test suite.  Everything now works except the part of 
>the test where cygwin unzip is to extract a zip file produced by Java.  This 
>particular zip file has entries whose path names are non-ASCII chars.  I have 
>manually verified that this zip file is perfectly extractable by 7zip and 
>WinZip, so Java does not seem to be the problem.

I just realized that there is something I should have mentioned earlier.

As of Java 7, the ZipOutputStream constructor has an optional argument that lets you specify which character encoding is used for things like path names.  See dhams comment here:

I am explicitly passing "UTF-8" as the character encoding (though I did not have to be explicit: UTF-8 is the default).
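For reference, here is a minimal sketch of the kind of code involved.  It is not the actual test suite: the class name ZipNameEncoding, the helper methods, and the sample entry name are made up for illustration.  It writes one entry with a non-ASCII path name using the two-argument ZipOutputStream(OutputStream, Charset) constructor, then reads the name back with ZipInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameEncoding {

    // Builds an in-memory zip with a single entry; the entry name is
    // encoded with the given charset (Java 7+ constructor).
    static byte[] writeZip(String entryName, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes, charset)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Returns the name of the first entry, decoded with the given charset,
    // or null if the archive has no entries.
    static String readFirstName(byte[] zip, Charset charset) throws IOException {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zip), charset)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical non-ASCII path name, written with Unicode escapes so
        // the source file's own encoding does not matter.
        String name = "d\u00e9j\u00e0-vu/\u6587\u4ef6.txt";
        byte[] zip = writeZip(name, StandardCharsets.UTF_8);
        String roundTripped = readFirstName(zip, StandardCharsets.UTF_8);
        System.out.println("names match: " + name.equals(roundTripped));
    }
}
```

A round trip inside Java only shows that the writer and reader agree with each other.  It says nothing about how another tool decodes the same bytes.  The zip format also defines a general purpose flag (bit 11) that marks entry names as UTF-8, so whether a particular unzip build looks at that flag may be worth checking.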

Could it be that cygwin unzip needs a different character encoding?

That would surprise me, since I thought that the Unix world is coalescing around UTF-8 as the default character encoding.

