I have a thread in which I am reading a file from a zip archive with zipfile.ZipFile().read(), and I am getting a MemoryError.
I am aware that read() loads the entire file into memory, and the file is more than 100MB after unzipping. I also tried zipfile.ZipFile().open().readlines(), but it takes too much time.
Is there any way to read the file quickly without getting a memory error?
If you get an unexpected MemoryError and you think you should have plenty of RAM available, it might be because you are using a 32-bit Python installation, since a 32-bit process can only address a few gigabytes no matter how much RAM is installed. The easy solution, if you have a 64-bit operating system, is to switch to a 64-bit installation of Python.
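To check which kind of interpreter you are running, a minimal sketch like the following may help. The helper name interpreter_bits is made up for illustration; struct.calcsize and sys.maxsize are standard library.

```python
import struct
import sys


def interpreter_bits() -> int:
    """Return the pointer width of the running interpreter in bits."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(f"{interpreter_bits()}-bit Python, sys.maxsize = {sys.maxsize}")
```

If this reports 32 on a 64-bit operating system, installing a 64-bit build is likely worth trying before changing any code.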
A MemoryError means that the interpreter has run out of memory to allocate to your Python program. This may be due to an issue in the setup of the Python environment or it may be a concern with the code itself loading too much data at the same time.
Python can work directly with data in ZIP files. You can look at the list of items in the archive and work with the data files themselves, for example by listing the names and content lengths of all the files included in the ZIP archive zipfile.zip.
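Listing names and content lengths could be sketched as follows. The function name list_members is hypothetical; ZipFile.infolist(), ZipInfo.filename and ZipInfo.file_size are part of the standard zipfile module. Only the archive's directory is read here, so no member is decompressed.

```python
import zipfile


def list_members(archive_path):
    """Return (name, uncompressed size in bytes) for every archive entry."""
    with zipfile.ZipFile(archive_path) as archive:
        # infolist() reads only the central directory, not the file data.
        return [(info.filename, info.file_size) for info in archive.infolist()]


if __name__ == "__main__":
    # 'zipfile.zip' is the archive name used in the text above.
    for name, size in list_members("zipfile.zip"):
        print(f"{name}: {size} bytes")
```

Checking file_size first is a cheap way to decide whether a member is small enough for read() or should be streamed instead.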
Assuming you're trying to read a zipped text file, you can treat the file-like object returned by ZipFile.open() as an iterator, and process it line-by-line...
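from zipfile import ZipFile

# The with-blocks close the member and the archive when the loop ends.
with ZipFile('myzip.zip') as archive:
    with archive.open('myfile.txt') as stream:
        for line in stream:  # each line is a bytes object, not str
            do_something_with(line)  # placeholder for your own processing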
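Two variations on the loop above, offered as a sketch rather than the only way to do it: decoding the lines to str as they are read, and reading fixed-size chunks when the member is binary or has no useful line structure. The function names, the utf-8 default and the 1 MiB chunk size are assumptions for illustration; io.TextIOWrapper and the read(size) call on the object returned by ZipFile.open() are standard library.

```python
import io
import zipfile


def iter_text_lines(archive_path, member, encoding="utf-8"):
    """Yield decoded lines from one member without loading it whole."""
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            # ZipFile.open() yields bytes; TextIOWrapper decodes lazily.
            with io.TextIOWrapper(raw, encoding=encoding) as text:
                for line in text:
                    yield line


def iter_chunks(archive_path, member, chunk_size=1024 * 1024):
    """Yield the member's decompressed bytes in pieces of at most chunk_size."""
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            while True:
                chunk = raw.read(chunk_size)
                if not chunk:  # empty bytes means end of the member
                    break
                yield chunk
```

Both are generators, so memory use stays roughly bounded by one line or one chunk at a time, for example `for line in iter_text_lines('myzip.zip', 'myfile.txt'): ...`.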