To decompress a file, use the same GZipStream class. You need two parameters: the source file and the name of the output file. Open a GZipStream on the compressed source file, then read in a loop for as long as the stream returns data.
The zlib library provides Deflate compression and decompression code for use by zip, gzip, png (which uses the zlib wrapper on deflate data), and many other applications.
With the zlib.decompress(s) method, we can decompress compressed bytes back into the original string.
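A minimal round trip, assuming Python 3, where zlib.compress and zlib.decompress work on bytes rather than str. The sample text is made up for illustration:

```python
import zlib

# zlib operates on bytes, so encode the text first.
original = "hello hello hello hello".encode("utf-8")

compressed = zlib.compress(original)     # zlib container (RFC 1950) by default
restored = zlib.decompress(compressed)   # returns bytes

assert restored == original
print(restored.decode("utf-8"))
```

Decoding back to str is a separate step; zlib itself knows nothing about text encodings.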
To decompress a gzip format file with zlib, call inflateInit2 with the windowBits parameter as 16+MAX_WBITS, like this:
inflateInit2(&stream, 16+MAX_WBITS);
If you don't do this, zlib will complain about a bad stream format. By default, zlib creates streams with a zlib header, and on inflate does not recognise the different gzip header unless you tell it so. Although this is documented starting in version 1.2.1 of the zlib.h header file, it is not in the zlib manual. From the header file:
windowBits can also be greater than 15 for optional gzip decoding. Add 32 to windowBits to enable zlib and gzip decoding with automatic header detection, or add 16 to decode only the gzip format (the zlib format will return a Z_DATA_ERROR). If a gzip stream is being decoded, strm->adler is a crc32 instead of an adler32.
The zlib library supports:
RFC 1950 (zlib compressed format)
RFC 1951 (deflate compressed format)
RFC 1952 (gzip compressed format)
The python zlib module supports these as well.
zlib can decompress all of these formats; pick wbits to match the format:
deflate format, use wbits = -zlib.MAX_WBITS
zlib format, use wbits = zlib.MAX_WBITS
gzip format, use wbits = zlib.MAX_WBITS | 16
See documentation in http://www.zlib.net/manual.html#Advanced (section inflateInit2)
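The wbits choices above can be wrapped in a single function. This is only a sketch, assuming Python 3; decompress_any is a hypothetical name, not part of the zlib module. It relies on header auto-detection first and treats a failure as a sign of raw deflate data:

```python
import zlib

def decompress_any(data):
    """Hypothetical helper: decompress zlib, gzip, or raw deflate bytes."""
    try:
        # MAX_WBITS | 32 asks zlib to auto-detect a zlib or gzip header.
        return zlib.decompress(data, zlib.MAX_WBITS | 32)
    except zlib.error:
        # No usable header was found: assume a raw deflate stream.
        return zlib.decompress(data, -zlib.MAX_WBITS)

def make(wbits, payload):
    # Build sample data in the format selected by wbits.
    c = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return c.compress(payload) + c.flush()

payload = b"test"
for wbits in (-zlib.MAX_WBITS, zlib.MAX_WBITS, zlib.MAX_WBITS | 16):
    assert decompress_any(make(wbits, payload)) == payload
```

If the input is none of the three formats, the second call raises zlib.error, so corrupt data still surfaces as an exception.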
test data:
>>> import zlib
>>> deflate_compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
>>> zlib_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
>>> gzip_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
>>>
>>> text = b'''test'''  # bytes literal; required on Python 3, where results print as b'test'
>>> deflate_data = deflate_compress.compress(text) + deflate_compress.flush()
>>> zlib_data = zlib_compress.compress(text) + zlib_compress.flush()
>>> gzip_data = gzip_compress.compress(text) + gzip_compress.flush()
>>>
obvious test for zlib:
>>> zlib.decompress(zlib_data)
'test'
test for deflate:
>>> zlib.decompress(deflate_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(deflate_data, -zlib.MAX_WBITS)
'test'
test for gzip:
>>> zlib.decompress(gzip_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|16)
'test'
the data is also compatible with gzip module:
>>> import gzip
>>> import io
>>> fio = io.BytesIO(gzip_data)
>>> f = gzip.GzipFile(fileobj=fio)
>>> f.read()
'test'
>>> f.close()
adding 32 to windowBits will trigger header detection
>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|32)
'test'
>>> zlib.decompress(zlib_data, zlib.MAX_WBITS|32)
'test'
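Header detection also works when the data arrives in pieces. The following is a rough sketch, assuming Python 3; stream_decompress and the chunk size are hypothetical choices, and the input is any binary file-like object:

```python
import io
import zlib

def stream_decompress(src, chunk_size=16384):
    """Hypothetical helper: yield decompressed chunks read from src."""
    # MAX_WBITS | 32 enables automatic zlib/gzip header detection.
    d = zlib.decompressobj(zlib.MAX_WBITS | 32)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:              # empty read means end of input
            break
        yield d.decompress(chunk)
    yield d.flush()                # emit anything still buffered

payload = b"test" * 1000
gz = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
gzip_data = gz.compress(payload) + gz.flush()

# A tiny chunk size forces several read/decompress iterations.
restored = b"".join(stream_decompress(io.BytesIO(gzip_data), chunk_size=8))
assert restored == payload
```

A decompression object keeps its state between calls, which is what allows the header and the body to be split across chunks.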
For gzip data with a gzip header you can use the gzip module directly instead; but please remember that under the hood, gzip uses zlib.
import gzip

# read the whole file; gzip decompresses on read, and the with block closes it
with gzip.open('abc.gz', 'rb') as fh:
    cdata = fh.read()
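For files too large to read in one call, the gzip module can be read in a loop until no data is left. A sketch assuming Python 3; gunzip_file, the file names, and the chunk size are hypothetical, and the demo uses throwaway files in a temporary directory:

```python
import gzip
import os
import tempfile

def gunzip_file(src_path, dst_path, chunk_size=64 * 1024):
    """Hypothetical helper: decompress the gzip file src_path into dst_path."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)   # decompressed bytes
            if not chunk:                  # empty bytes means end of stream
                break
            dst.write(chunk)

# Demo: create a small .gz file, decompress it, and read the result back.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "abc.gz")
    dst_path = os.path.join(tmp, "abc.txt")
    with gzip.open(src_path, "wb") as fh:
        fh.write(b"test data\n" * 100)
    gunzip_file(src_path, dst_path)
    with open(dst_path, "rb") as fh:
        result = fh.read()

assert result == b"test data\n" * 100
```

Reading in fixed-size chunks keeps memory use bounded regardless of how large the decompressed output is.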
zlib and gzip have different structures. zlib follows RFC 1950 and gzip follows RFC 1952, so they have different headers, but the compressed data inside both has the same structure and follows RFC 1951.
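The shared structure can be checked by stripping the wrappers by hand. This is a sketch under stated assumptions: Python 3, a zlib stream without a preset dictionary (2-byte header), and a gzip stream produced by zlib with no optional header fields (10-byte header). Files written by other tools may carry a file name or extra fields, which makes the gzip header longer.

```python
import struct
import zlib

payload = b"test"

zlib_c = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
zlib_data = zlib_c.compress(payload) + zlib_c.flush()
gzip_c = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
gzip_data = gzip_c.compress(payload) + gzip_c.flush()

# RFC 1950: the first two bytes, read as a 16-bit number, are a multiple of 31;
# the trailer is a big-endian Adler-32 of the uncompressed data.
assert (zlib_data[0] * 256 + zlib_data[1]) % 31 == 0
assert struct.unpack(">I", zlib_data[-4:])[0] == zlib.adler32(payload)

# RFC 1952: magic bytes 1f 8b, then compression method 8 (deflate);
# the trailer is a little-endian CRC-32 followed by the uncompressed length.
assert gzip_data[:3] == b"\x1f\x8b\x08"
crc, size = struct.unpack("<II", gzip_data[-8:])
assert crc == zlib.crc32(payload) and size == len(payload)

# With header and trailer removed, both bodies decode as raw deflate (RFC 1951).
zlib_body = zlib_data[2:-4]
gzip_body = gzip_data[10:-8]
assert zlib.decompress(zlib_body, -zlib.MAX_WBITS) == payload
assert zlib.decompress(gzip_body, -zlib.MAX_WBITS) == payload
```

The different trailers are also why strm->adler holds a crc32 rather than an adler32 when a gzip stream is being decoded.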