I need to create ZIP archives on demand, using either the Python zipfile module or Unix command-line utilities.
The resources to be zipped are often larger than 1 GB and not necessarily compression-friendly.
How can I efficiently estimate the archive's creation time and size?
When you open a ZIP file with the archive manager, it shows the size of the contained files. To see the combined size of all or some of the contained files, select them (CTRL+A selects all files) and look at the status bar at the bottom.
gzip compression adds about 0.001 seconds to compress and 0.0003 seconds to decompress (let's round up and say 0.002 total), but you only have to transmit 16 kB, which takes 0.0032 seconds. Add them together, and the transfer with gzip compression is about twice as fast.
Microsoft Windows provides a utility that lets you zip multiple files into a single compressed file. This is especially helpful if you are emailing files as attachments or if you need to conserve space (zipping files can reduce file size by up to 50%).
One reason for extremely slow unzipping on Windows can be Defender, which runs in the background and scans each file. This usually happens when you unzip a file that was downloaded from online storage (e.g. Google Drive) or received as an email attachment.
Extract a number of small parts from the big file, for example 64 randomly selected chunks of 64 KB each.
Concatenate the data, compress it, and measure the time and the compression ratio. Since you've randomly selected parts of the file, chances are that you have compressed a representative subset of the data.
Then scale the measured time and compressed size up to the whole file in proportion to its size.
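The sampling approach can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name `estimate_zip` and its parameters are made up for this example, and `zlib` is used as a stand-in for the ZIP step because the `zipfile` module's `ZIP_DEFLATED` method relies on the same DEFLATE algorithm. It ignores ZIP header overhead and disk read time, so the result is a rough estimate.

```python
import os
import random
import time
import zlib


def estimate_zip(path, n_chunks=64, chunk_size=64 * 1024, level=6, seed=None):
    """Return (estimated_compressed_bytes, estimated_seconds) for one file.

    Hypothetical helper: compresses randomly sampled chunks and scales
    the measured size and time linearly to the full file size.
    """
    total = os.path.getsize(path)
    if total == 0:
        return 0, 0.0

    rng = random.Random(seed)
    sample = bytearray()
    with open(path, "rb") as f:
        if total <= n_chunks * chunk_size:
            # Small file: sampling gains nothing, so use all of it.
            sample += f.read()
        else:
            for _ in range(n_chunks):
                # Pick a random offset that leaves room for a full chunk.
                f.seek(rng.randrange(0, total - chunk_size + 1))
                sample += f.read(chunk_size)

    # Time only the compression of the concatenated sample.
    start = time.perf_counter()
    compressed = zlib.compress(bytes(sample), level)
    elapsed = time.perf_counter() - start

    # Extrapolate from the sample to the whole file.
    scale = total / len(sample)
    return int(len(compressed) * scale), elapsed * scale
```

Calling `estimate_zip("big.bin")` on a multi-gigabyte file reads only about 4 MB of it, so the estimate itself stays cheap. If the file's content varies a lot from region to region, a larger `n_chunks` should make the sample more representative.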
I suggest you measure the average time it takes to produce a zip of a certain size, then calculate the estimate from that measurement. However, I think the estimate will be very rough in any case if you don't know how well the data compresses. If the data you want to compress had a very similar "profile" each time, you could probably make better predictions.
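A minimal sketch of this calibration idea, assuming you keep a history of (input size, elapsed seconds) pairs from earlier runs. The names `timed_zip` and `estimate_seconds` are hypothetical, and the linear scaling is itself an assumption that only holds when the data has a similar profile from run to run.

```python
import os
import time
import zipfile


def timed_zip(src_path, zip_path):
    """Zip one file and return (input_bytes, seconds) for the history."""
    start = time.perf_counter()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src_path, arcname=os.path.basename(src_path))
    return os.path.getsize(src_path), time.perf_counter() - start


def estimate_seconds(history, new_size):
    """Estimate zip time for new_size bytes from past (bytes, seconds) pairs.

    Uses the average seconds-per-byte over all recorded runs.
    """
    total_bytes = sum(b for b, _ in history)
    total_secs = sum(s for _, s in history)
    if total_bytes == 0:
        raise ValueError("history contains no data to calibrate from")
    return new_size * total_secs / total_bytes
```

Each call to `timed_zip` yields one pair to append to the history, and `estimate_seconds(history, os.path.getsize(path))` then gives a prediction for the next archive.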