[PATCH 5/6] Handle bzip2 multi-stream files

Jon TURNEY jon.turney@dronecode.org.uk
Thu Apr 21 10:07:00 GMT 2011

On 20/04/2011 21:16, Charles Wilson wrote:
> On 4/20/2011 12:39 PM, Corinna Vinschen wrote:
>> On Apr  8 15:43, Jon TURNEY wrote:
>>> bzip2 compressed files are allowed to contain multiple streams of
>>> compressed data.  So if we arrive at BZ_STREAM_END but are not at
>>> EOF, restart the decompressor.  Decompress any data which remains
>>> in the stream before reading any more.
>> I take this at face value since I'm not fluent in bzip compression.

I am no expert either, but 'man bzip2' says:

"bunzip2 will correctly decompress a file which is the concatenation of two or
more compressed files. The result is the concatenation of the corresponding
uncompressed files."

The technique which pbzip2 uses to parallelize compression should be obvious :-)
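
For illustration only, here is a minimal sketch of the "restart the
decompressor at end of stream" idea. It uses Python's standard bz2 module
rather than the libbz2 C API, so it is not the patch itself. The function
name decompress_multistream is made up for this example, and the decompressor's
'eof' flag is assumed to play roughly the role of BZ_STREAM_END.

```python
import bz2


def decompress_multistream(data: bytes) -> bytes:
    """Decompress a buffer holding one or more concatenated bzip2 streams.

    Hypothetical helper for illustration; not taken from the patch.
    """
    chunks = []
    while data:
        # One decompressor object handles exactly one stream.
        decomp = bz2.BZ2Decompressor()
        chunks.append(decomp.decompress(data))
        if not decomp.eof:
            # Input ran out before the end-of-stream marker was seen.
            raise ValueError("truncated bzip2 stream")
        # Bytes after the end of this stream belong to the next stream,
        # so restart with a fresh decompressor on what is left over.
        data = decomp.unused_data
    return b"".join(chunks)


# Two independently compressed pieces, concatenated the way a parallel
# compressor could plausibly emit them.
first = bz2.compress(b"hello ")
second = bz2.compress(b"world")
combined = first + second

# A single decompressor stops at the end of the first stream.
single = bz2.BZ2Decompressor()
only_first = single.decompress(combined)

everything = decompress_multistream(combined)
```

In this sketch a single decompressor yields only the data from the first
stream and leaves the rest in 'unused_data', which is the same symptom as
stopping at the first BZ_STREAM_END without checking for EOF.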

>> So, if that helps to circumvent a potential problem, just go ahead.
>> Does that really occur in some of our tar.bz files?

I think the only bzip2 files which can have this problem are those which have
been generated using pbzip2, so this bug has remained latent until someone
tried that, as reported at the beginning of this thread.

> Apparently it happens when pbzip [*] is used (on a cross build
> environment, since cygwin does not actually ship pbzip) to compress the
> tarball.

I suspect that pbzip2 is available in cygwinports :-)

More information about the Cygwin-apps mailing list