This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: bug/deficiency in unzip: incompatible with other programs when entry path names have non-ascii chars


>On 2014-11-04 18:08, Brent wrote:

> 
>I then reran my complete test suite.  Everything now works except the part of 
>the test where cygwin unzip is to extract a zip file produced by Java.  This 
>particular zip file has entries whose path names are non-ASCII chars.  I have 
>manually verified that this zip file is perfectly extractable by 7zip and 
>WinZip, so Java does not seem to be the problem.


I just realized that there is something I should have mentioned earlier.

As of Java 7, ZipOutputStream has a constructor that lets you specify the character encoding used for things like entry path names.  See dhams comment here:
    https://stackoverflow.com/questions/9974779/using-unicode-characters-for-file-names-inside-a-zip-archive

I am explicitly passing "UTF-8" as the character encoding (though I did not have to be explicit: UTF-8 is the default).
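
For reference, here is a minimal sketch of the kind of code I mean.  It is not my actual test code: the class name, the method names and the sample entry name are made up for illustration.  It writes one entry with a non-ASCII path name using the Java 7 ZipOutputStream(OutputStream, Charset) constructor, reads the name back with ZipInputStream, and reports whether bit 11 of the general purpose flag (the "names are UTF-8" marker in the zip format) is set in the first local file header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class Utf8ZipSketch {

    // Builds an in-memory zip with a single entry, encoding the entry name as UTF-8.
    static byte[] zipSingleEntry(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content);
            zos.closeEntry();
        }
        return buffer.toByteArray();
    }

    // Reads back the name of the first entry, decoding entry names as UTF-8.
    static String firstEntryName(byte[] zipBytes) throws IOException {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    // The general purpose bit flag is the little-endian 16-bit field at offset 6
    // of the first local file header; bit 11 (0x0800) marks names as UTF-8.
    static boolean hasUtf8NameFlag(byte[] zipBytes) {
        int flag = (zipBytes[6] & 0xff) | ((zipBytes[7] & 0xff) << 8);
        return (flag & 0x0800) != 0;
    }

    public static void main(String[] args) throws IOException {
        String name = "dir/\u00e9t\u00e9.txt"; // sample name with non-ASCII chars
        byte[] zip = zipSingleEntry(name, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("round trip ok: " + name.equals(firstEntryName(zip)));
        System.out.println("UTF-8 name flag set: " + hasUtf8NameFlag(zip));
    }
}
```

If the flag turns out to be set and unzip still mangles the names, that would point at how unzip decodes the names rather than at what Java writes.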

Could it be that cygwin unzip needs a different character encoding?

That would surprise me, since I thought the Unix world was converging on UTF-8 as the default character encoding.


