
How to inflate a partial zlib file

I have the first contiguous 2/3rds of a file that was compressed with zlib's deflate() function. The last 1/3 was lost in transmission. The original uncompressed file was 600KB.

The transmitter called deflate() repeatedly, feeding it the original file in 2KB chunks with Z_NO_FLUSH until the end of the file, where it passed Z_FINISH. The complete compressed file was then transmitted, but part of it was lost as described.

Is it possible to recover part of the original file? If so, any suggestions on how?

I can use either the plain C implementation of zlib or the Python 2.7 zlib module.

JohnSantaFe asked Dec 16 '13 20:12


2 Answers

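Though I don't know Python, I managed to get this to work:

#!/usr/bin/python
import sys
import zlib

# Usage: script.py <partial zlib file> <output file>
f = open(sys.argv[1], "rb")
g = open(sys.argv[2], "wb")
z = zlib.decompressobj()
while True:
    # Prefer input zlib has not consumed yet; otherwise read more from the file.
    buf = z.unconsumed_tail
    if not buf:
        buf = f.read(8192)
        if not buf:
            break  # end of the partial file
    got = z.decompress(buf)
    if not got:
        break  # no more output can be produced
    g.write(got)
f.close()
g.close()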

That should extract all that's available from your partial zlib file.
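
For readers on Python 3, where file reads return bytes rather than str, the same idea can be sketched as a function. This is a minimal sketch and not the answer's original code; the name recover_partial_zlib and its chunk_size parameter are hypothetical, chosen for illustration, and the eof check assumes Python 3.3 or later.

```python
import zlib


def recover_partial_zlib(data, chunk_size=8192):
    """Return whatever can be inflated from a possibly truncated zlib stream.

    Hypothetical helper for illustration; `data` holds the bytes received so far.
    """
    decompressor = zlib.decompressobj()
    recovered = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        try:
            recovered.append(decompressor.decompress(chunk))
        except zlib.error:
            # Corrupt (not merely truncated) input: keep what was recovered so far.
            break
        if decompressor.eof:
            # The end-of-stream marker was reached, so nothing was missing.
            break
    return b"".join(recovered)
```

Passing the contents of the partial file, read in binary mode, should return a prefix of the original uncompressed data.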

Mark Adler answered Oct 01 '22 09:10


Update: As @Mark Adler pointed out, partial content can be decompressed using zlib.decompressobj:

>>> decompressor = zlib.decompressobj()
>>> decompressor.decompress(part)
"let's compress some t"

where part is defined below.
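
A decompression object can also indicate whether the stream it was given was complete. The sketch below assumes Python 3.3 or later, where decompression objects expose an eof attribute; the variable names are illustrative, the sample text mirrors the one used in this answer, and the exact number of bytes recovered from a truncated stream is not guaranteed.

```python
import zlib

text = b"let's compress some text"
compressed = zlib.compress(text)
part = compressed[:3 * len(compressed) // 4]  # keep the first three quarters

decompressor = zlib.decompressobj()
recovered = decompressor.decompress(part)  # returns what it can, raises nothing

# eof stays False because the end-of-stream marker was never seen.
is_truncated = not decompressor.eof
```

Checking eof after feeding all available input is one way to tell a truncated transfer from a complete one without relying on an exception.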

--- Old comment follows:

By default, the one-shot zlib.decompress() function in Python doesn't handle partial content.

This works:

>>> compressed = "let's compress some text".encode('zip')
>>> compressed
'x\x9c\xcbI-Q/VH\xce\xcf-(J-.V(\xce\xcfMU(I\xad(\x01\x00pX\t%'
>>> compressed.decode('zip')
"let's compress some text"

It doesn't work if we truncate it:

>>> part = compressed[:3*len(compressed)/4]
>>> part.decode('zip')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File ".../lib/python2.7/encodings/zlib_codec.py", line 43, in zlib_decode
    output = zlib.decompress(input)
error: Error -5 while decompressing data: incomplete or truncated stream

The same if we use zlib explicitly:

>>> import zlib
>>> zlib.decompress(compressed)
"let's compress some text"
>>> zlib.decompress(part)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
error: Error -5 while decompressing data: incomplete or truncated stream

jfs answered Oct 01 '22 10:10