
Get uncompressed size of a .gz file in python

Tags: python, gzip

Using gzip, tell() returns the offset in the uncompressed file.
In order to show a progress bar, I want to know the original (uncompressed) size of the file.
Is there an easy way to find out?

Paul Oyster asked Nov 09 '09 22:11



1 Answer

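The uncompressed size is stored in the last 4 bytes of the gzip file (the ISIZE field of the trailer) as a little-endian unsigned 32-bit integer. We can read those bytes and convert them to an int. The field holds the size modulo 2^32, so the result is only correct for files under 4 GB. If the file consists of several concatenated gzip members, the value describes only the last member.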

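import struct

def getuncompressedsize(filename):
    with open(filename, 'rb') as f:
        # Jump to the 4-byte ISIZE field at the very end of the file
        # (whence=2 means "relative to the end of the file").
        f.seek(-4, 2)
        # '<I' = little-endian unsigned 32-bit int, as the gzip format
        # specifies; a bare 'I' would use the machine's native byte order.
        return struct.unpack('<I', f.read(4))[0]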
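Since the trailer value cannot be trusted for files of 4 GB or more, one alternative for a progress bar is to measure progress against the compressed file instead: open the raw file yourself, hand it to gzip.GzipFile, and compare the raw file's tell() with the on-disk size. The following is only a sketch of that idea, not part of the original answer. The helper name iter_with_progress and the chunk_size parameter are made up for illustration, and because gzip reads ahead in buffered chunks the reported fraction is approximate (a small file may jump straight to 1.0).

```python
import gzip
import os

def iter_with_progress(filename, chunk_size=64 * 1024):
    """Yield (chunk, fraction_done) while decompressing filename.

    Hypothetical helper: fraction_done is the share of the *compressed*
    file that has been read so far, so it works regardless of file size.
    """
    total = os.path.getsize(filename)  # compressed size on disk
    with open(filename, 'rb') as raw:
        with gzip.GzipFile(fileobj=raw, mode='rb') as gz:
            while True:
                chunk = gz.read(chunk_size)
                if not chunk:
                    break
                # raw.tell() is the offset in the compressed file, which
                # includes whatever gzip has buffered ahead of the chunk.
                yield chunk, (raw.tell() / total if total else 1.0)
```

A caller would loop over the generator, process each chunk, and redraw the bar with the fraction, for example `for chunk, done in iter_with_progress('data.gz'): ...` where 'data.gz' is a placeholder path.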
Brice M. Dempsey answered Sep 30 '22 19:09