
Faster alternative to Python's zipfile module?

Is there a noticeably faster alternative to the zipfile module in Python 2.7.4 (with ZIP_DEFLATED) for zipping a large number of files into a single zip file? I had a look at czipfile (https://pypi.python.org/pypi/czipfile/1.0.0), but that appears to be focused on faster decryption, not compression.

I routinely have to pack a large number of image files (~12,000 files, a mix of .exr and .tiff, each ~1 MB to 6 MB, ~9 GB in total) into a single zip file for shipment. The zipping takes ~90 minutes on Windows 7 64-bit.

If anyone can recommend a different Python module (or a C/C++ library, or even a standalone tool) that can compress a large number of files into a single .zip file in less time than the zipfile module, that would be greatly appreciated. Anything ~5-10% faster (or more) would be very helpful.

michaelhubbard.ca asked Apr 22 '13 23:04





1 Answer

As Patashu mentions, outsourcing to 7-zip might be the best idea.

Here's some sample code to get you started:

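import os
import subprocess

# Adjust these paths for your machine.
path_7zip = r"C:\Program Files\7-Zip\7z.exe"
path_working = r"C:\temp"
outfile_name = "compressed.zip"
os.chdir(path_working)  # 7-Zip resolves the wildcards below relative to this directory

# "a" adds files to an archive and "-tzip" selects the zip format.
# "-pSECRET" sets the archive password to SECRET; drop it if you don't need encryption.
ret = subprocess.check_output([path_7zip, "a", "-tzip", outfile_name, "*.txt", "*.py", "-pSECRET"])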

As martineau mentioned, you might experiment with compression methods. This page gives some examples of how to change the command line parameters.
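To make that kind of experiment easier, here is a rough sketch of a helper that builds the 7-Zip argument list with a configurable compression level. The function name build_7zip_command and its parameters are made up for illustration, and the "*.exr" / "*.tiff" patterns are only a guess at what the question's file set would need. It relies on 7-Zip's -mx switch (compression level) and -mmt switch (multithreading); whether a lower level is actually faster for your images is something you would have to time yourself.

```python
import os
import subprocess


def build_7zip_command(path_7zip, outfile_name, patterns, level=1, multithread=True):
    """Return the argument list for a 7-Zip "add to archive" call (hypothetical helper).

    level is passed to 7-Zip's -mx switch: 0 stores without compressing,
    1 is the fastest compression, 9 is the strongest.
    """
    if level not in (0, 1, 3, 5, 7, 9):
        raise ValueError("level must be one of 0, 1, 3, 5, 7, 9")
    command = [path_7zip, "a", "-tzip", "-mx=%d" % level]
    if multithread:
        # Ask 7-Zip to use multiple threads where the format supports it.
        command.append("-mmt=on")
    command.append(outfile_name)
    command.extend(patterns)
    return command


if __name__ == "__main__":
    # Paths are assumptions carried over from the sample above.
    path_7zip = r"C:\Program Files\7-Zip\7z.exe"
    command = build_7zip_command(path_7zip, "compressed.zip", ["*.exr", "*.tiff"])
    print(command)
    # Only run the archiver if it is actually installed at that path.
    if os.path.isfile(path_7zip):
        subprocess.check_call(command, cwd=r"C:\temp")
```

Passing cwd to subprocess does the same job as os.chdir without changing the working directory of the calling script. Trying level=0 is worth a timing run too, since image formats that are already compressed may not shrink much further.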

bbayles answered Sep 28 '22 10:09
