Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge multiple zip files into a single zip file in Python

Tags:

python

zip

I have multiple zip files that have the same structure -- they contain XML files at the root level. All files in each zip file are unique (no duplicates across the zip files). I need to combine all of the XML files from all of the zip files into a single zip file (with the same structure as the original zip files). Suggestions for how to best go about doing this? Thanks.

like image 722
Dave Crumbacher Avatar asked May 13 '12 00:05

Dave Crumbacher


People also ask

How do I combine multiple ZIP files into one?

Zipping Multiple Files (It will help if you move all the files you wish to zip to a single folder.) Hold down [Ctrl] on your keyboard > Click on each file you wish to combine into a zipped file. Right-click and select "Send To" > Choose "Compressed (Zipped) Folder."

How do I zip 3 files in Python?

To zip multiple files in Python, use the zipfile. ZipFile() method. Iterate all the files that need to be zipped and use the write() method to write the final zipped file.


2 Answers

This is the shortest version I could come up with:

>>> import zipfile as z
>>> z1 = z.ZipFile('z1.zip', 'a')
>>> z2 = z.ZipFile('z2.zip', 'r')
>>> z1.namelist()
['a.xml', 'b.xml']
>>> z2.namelist()
['c.xml', 'd.xml']
>>> [z1.writestr(t[0], t[1].read()) for t in ((n, z2.open(n)) for n in z2.namelist())]
[None, None]
>>> z1.namelist()
['a.xml', 'b.xml', 'c.xml', 'd.xml']
>>> z1.close()

Without testing the alternative, to me this is the best (and probably most obvious too!) solution because - assuming both zip files contains the same amount of data, this method requires the decompression and re-compression of only half of it (1 file).

PS: List comprehension is there just to keep instructions on one line in the console (which speeds debugging up). Good pythonic code would require a proper for loop, given that the resulting list serves no purpose...

HTH!

like image 187
mac Avatar answered Oct 12 '22 14:10

mac


Here's what I came up with, thanks to @mac. Note that the way this is currently implemented the first zip file is modified to contain all the files from the other zip files.

import zipfile as z

zips = ['z1.zip', 'z2.zip', 'z3.zip']

"""
Open the first zip file as append and then read all
subsequent zip files and append to the first one
"""
with z.ZipFile(zips[0], 'a') as z1:
    for fname in zips[1:]:
        zf = z.ZipFile(fname, 'r')
        for n in zf.namelist():
            z1.writestr(n, zf.open(n).read())
like image 41
Dave Crumbacher Avatar answered Oct 12 '22 14:10

Dave Crumbacher