Could you please help me write a function returning:
    {"file1.txt": list(<contents of file1>),
     "file2.txt": list(<contents of file2>),
     "file3.txt": list(<contents of file3>),
     "file4.txt": list(<contents of file4>)}
On input:
    file.zip:
        outer\
        outer\inner1.zip:
            file1.txt
            file2.txt
        outer\inner2.zip:
            file3.txt
            file4.txt
My attempts (with exceptions below):
http://ideone.com/s1tyb
WindowsError: [Error 32] The process cannot access the file because it is being used by another process
http://ideone.com/Y2oTw
"File is not a zip file"
http://ideone.com/0HoGa
"File is not a zip file"
http://ideone.com/owmdK
AttributeError: ZipFile instance has no attribute 'seek'
-r option: to zip a directory recursively, pass -r to the zip command; it then recursively zips the files in that directory.
To unzip a file in Python, use the ZipFile class from the zipfile module. Its extractall() method takes optional path, members, and pwd arguments and extracts the contents of the archive.
Unzipping with the zipfile module: as with zipping, you first create a ZipFile object. For unzipping, the first parameter is the path to the zip file and the second parameter is the mode, which should be "r" (read). "r" is also the default mode.
Called without a path, extractall() extracts all the contents of the zip file to the current working directory. To extract a single file, call extract() with that member's name as it appears inside the zip file; only that member is extracted.
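As a small sketch of the two calls described above, the following builds a throwaway archive and extracts it both ways. The helper name `demo_extract` and the file names (`demo.zip`, `a.txt`, `sub/b.txt`) are made up for illustration.

```python
import os
from zipfile import ZipFile

def demo_extract(workdir):
    """Build a small archive in workdir, then extract it two ways."""
    archive_path = os.path.join(workdir, "demo.zip")
    # Mode "w" creates the archive; writestr adds a member from in-memory data.
    with ZipFile(archive_path, "w") as zf:
        zf.writestr("a.txt", "alpha")
        zf.writestr("sub/b.txt", "beta")

    all_dir = os.path.join(workdir, "all")
    one_dir = os.path.join(workdir, "one")
    # Mode "r" opens the archive for reading.
    with ZipFile(archive_path, "r") as zf:
        # Extract every member under all_dir.
        zf.extractall(path=all_dir)
        # Extract a single member under one_dir; the created path is returned.
        one_path = zf.extract("sub/b.txt", path=one_dir)
    return all_dir, one_path
```

Calling `demo_extract(tempfile.mkdtemp())` should leave both members under `all/` and only `sub/b.txt` under `one/`.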
Finally worked it out, with a bit of help from "Extracting a zipfile to memory?":
    from zipfile import ZipFile, is_zipfile

    def extract_zip(input_zip):
        input_zip = ZipFile(input_zip)
        return {name: input_zip.read(name) for name in input_zip.namelist()}

    def extract_all(input_zip):
        return {entry: extract_zip(entry) for entry in ZipFile(input_zip).namelist() if is_zipfile(entry)}
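One caveat with the snippet above: `is_zipfile(entry)` and `ZipFile(entry)` are given a member name, which both treat as a path on disk, so it appears to rely on the inner archives also existing at those paths. Below is a minimal in-memory sketch, assuming Python 3, that wraps each member's bytes in `io.BytesIO` instead. The name `extract_nested` is hypothetical. It returns raw bytes rather than lists, and it also keeps any non-zip files found in the outer archive.

```python
from io import BytesIO
from zipfile import ZipFile, is_zipfile

def extract_nested(source):
    """Return {member_name: bytes} for the files inside nested zip archives.

    source may be a filesystem path or a binary file-like object.
    """
    result = {}
    with ZipFile(source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to read
            data = archive.read(name)
            if is_zipfile(BytesIO(data)):
                # The member is itself a zip: recurse without touching disk.
                result.update(extract_nested(BytesIO(data)))
            else:
                result[name] = data
    return result
```

For the layout in the question, `extract_nested('file.zip')` should give a flat dict keyed by `file1.txt` through `file4.txt`. Members that share a name across inner archives overwrite each other, so the keys would need a prefix if that can happen.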
Modified your code (you should close each ZipFile before deleting it; I also added extraction of the inner zip files):
    import os
    import shutil
    import tempfile
    from zipfile import ZipFile, is_zipfile

    def unzip_recursively(parent_archive):
        result = {}
        tmpdir = tempfile.mkdtemp()
        try:
            # Close the outer archive as soon as it has been extracted
            with ZipFile(parent_archive) as outer:
                outer.extractall(path=tmpdir)
                namelist = outer.namelist()
            for name in namelist:
                innerzippath = os.path.join(tmpdir, name)
                # Skip directory entries and anything that is not a zip file
                if not is_zipfile(innerzippath):
                    continue
                inner_extract_path = innerzippath + '.content'
                if not os.path.exists(inner_extract_path):
                    os.makedirs(inner_extract_path)
                # Close each inner archive before the temp directory is removed
                with ZipFile(innerzippath) as inner_zip:
                    inner_zip.extractall(path=inner_extract_path)
                    for inner_file_name in inner_zip.namelist():
                        with open(os.path.join(inner_extract_path, inner_file_name), 'rb') as f:
                            result[inner_file_name] = f.read()
        finally:
            shutil.rmtree(tmpdir)
        return result

    if __name__ == '__main__':
        print(unzip_recursively('file.zip'))