In a Django project, I need to generate PDF files for objects in the database. Since each file takes a few seconds to generate, I use Celery to run the tasks asynchronously.
The problem is that I need to add each file to a zip archive. I was planning to use Python's zipfile module, but different tasks can run in different threads, and I wonder what will happen if two tasks try to add a file to the archive at the same time.
Is the following code thread-safe or not? I cannot find any useful information in Python's official documentation.
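    import os
    from zipfile import ZipFile

    zippath = os.path.join(pdf_directory, 'archive.zip')
    # Open before the try block so close() is never called on an undefined name.
    archive = ZipFile(zippath, 'a')  # 'a' appends to an existing archive
    try:
        archive.write(pdf_fullname)
    finally:
        archive.close()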
Note: this is running under Python 2.6.
Python code is not thread-safe by itself. The GIL protects the interpreter's internal state, not your program's data, and the efforts to remove it (nogil and similar) do not make functions thread-safe either.
Python's zipfile is a standard library module intended to manipulate ZIP files. This file format is a widely adopted industry standard when it comes to archiving and compressing digital data. You can use it to package together several related files.
Python can work directly with data in ZIP files. You can list the entries in the archive's directory and work with the member files themselves.
To extract a specific file from an archive: import the zipfile module, create a zip file object using the ZipFile class, then call the extract() method on that object, passing the name of the file to extract and the path where it should be extracted.
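The steps above can be sketched as follows. This is only an illustration: the helper name list_and_extract and its parameters are made up for the example, and using ZipFile as a context manager assumes Python 2.7 or later (on 2.6, use try/finally as in the question).

```python
import os
from zipfile import ZipFile


def list_and_extract(zippath, member, dest_dir):
    """Return (names stored in the archive, path of the extracted member)."""
    with ZipFile(zippath) as archive:        # read mode is the default
        names = archive.namelist()           # entries in the archive's directory
        extracted = archive.extract(member, dest_dir)
    return names, extracted
```

extract() returns the path of the file it created, which is handy when the caller needs to open the extracted file afterwards.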
No, it is not thread-safe in that sense.
If you're appending to the same zip file, you'd need a lock there, or the file contents could get scrambled.
If you're appending to different zip files, using separate ZipFile() objects, then you're fine.
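A minimal sketch of the locking approach, assuming all writers are threads inside one process. The names archive_lock and add_to_archive are hypothetical. Note that a threading.Lock does not protect against other processes, so if your Celery workers run as separate processes you would need some form of inter-process lock instead.

```python
import os
import threading
from zipfile import ZipFile

# Hypothetical module-level lock, shared by every thread that touches the archive.
archive_lock = threading.Lock()


def add_to_archive(zippath, filename):
    """Append one file to the archive while holding the lock."""
    with archive_lock:
        # Only one thread at a time opens, writes to and closes the archive.
        archive = ZipFile(zippath, 'a')
        try:
            # Store the file under its base name rather than its full path.
            archive.write(filename, os.path.basename(filename))
        finally:
            archive.close()
```

Opening and closing the archive inside the lock keeps each append self-contained, at the cost of rewriting the zip's central directory on every call.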
Python 3.5.5 makes writing to ZipFile and reading multiple ZipExtFiles threadsafe: https://docs.python.org/3.5/whatsnew/changelog.html#id93
As far as I can tell, the change has not been backported to Python 2.7.
Update: after studying the code and some testing, it becomes apparent that the locking is still not thoroughly implemented. It works correctly only for writestr and doesn't work for open and write.
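If you rely on that observation, a rough sketch looks like the one below: the threads share a single ZipFile object and only ever call writestr with data already held in memory. This assumes a Python 3 version in which writestr takes the archive's internal lock, as described above; the function name add_all and the (name, data) pairs are hypothetical. On older versions, use an explicit lock instead.

```python
import threading
from zipfile import ZipFile


def add_all(zippath, documents):
    """Write each (name, data) pair into one shared archive, one thread per pair."""
    archive = ZipFile(zippath, 'a')
    try:
        threads = [
            # Only writestr is called concurrently; write() and open() are avoided.
            threading.Thread(target=archive.writestr, args=(name, data))
            for name, data in documents
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        archive.close()
```

This means reading each PDF into memory before archiving it, so it suits small files better than large ones.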