I'm having an interesting problem with threads and the tempfile module in Python. Something doesn't appear to be getting cleaned up until the threads exit, and I'm running against an open file limit. (This is on OS X 10.5.8, Python 2.5.1.)
Yet if I sort of replicate what the tempfile module is doing (not all the security checks, but just generating a file descriptor and then using os.fdopen to produce a file object) I have no problems.
Before filing this as a bug with Python, I figured I'd check here, as it's much more likely that I'm doing something subtly wrong. But if I am, a day of trying to figure it out hasn't gotten me anywhere.
#!/usr/bin/python
import threading
import thread
import tempfile
import os
import time
import sys
NUM_THREADS = 10000
def worker_tempfile():
tempfd, tempfn = tempfile.mkstemp()
tempobj = os.fdopen(tempfd, 'wb')
tempobj.write('hello, world')
tempobj.close()
os.remove(tempfn)
time.sleep(10)
def worker_notempfile(index):
tempfn = str(index) + '.txt'
# The values I'm passing os.open may be different than tempfile.mkstemp
# uses, but it works this way as does using the open() function to create
# a file object directly.
tempfd = os.open(tempfn,
os.O_EXCL | os.O_CREAT | os.O_TRUNC | os.O_RDWR)
tempobj = os.fdopen(tempfd, 'wb')
tempobj.write('hello, world')
tempobj.close()
os.remove(tempfn)
time.sleep(10)
def main():
for count in range(NUM_THREADS):
if count % 100 == 0:
print('Opening thread %s' % count)
wthread = threading.Thread(target=worker_tempfile)
#wthread = threading.Thread(target=worker_notempfile, args=(count,))
started = False
while not started:
try:
wthread.start()
started = True
except thread.error:
print('failed starting thread %s; sleeping' % count)
time.sleep(3)
if __name__ == '__main__':
main()
If I run it with the worker_notempfile
line active and the worker_tempfile
line commented-out, it runs to completion.
The other way around (using worker_tempfile
) I get the following error:
$ python threadtempfiletest.py
Opening thread 0
Opening thread 100
Opening thread 200
Opening thread 300
Exception in thread Thread-301:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 460, in __bootstrap
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 440, in run
File "threadtempfiletest.py", line 17, in worker_tempfile
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tempfile.py", line 302, in mkstemp
File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tempfile.py", line 236, in _mkstemp_inner
OSError: [Errno 24] Too many open files: '/var/folders/4L/4LtD6bCvEoipksvnAcJ2Ok+++Tk/-Tmp-/tmpJ6wjV0'
Any ideas what I'm doing wrong? Is this a bug in Python, or am I being bone-headed?
UPDATE 2009-12-14: I think I've found the answer, but I don't like it. Since nobody was able to replicate the problem, I went hunting around our office for machines. It passed on everything except my machine. I tested on a Mac with the same software versions I was using. I even went hunting for a Desktop G5 with the EXACT same hardware and software config I had -- same result. Both tests (with tempfile and without tempfile) succeeded on everything.
For kicks, I downloaded Python 2.6.4, and tried it on my desktop, and same pattern on my system as Python 2.5.1: tempfile failed, and notempfile succeeded.
This is leading me to the conclusion that something's hosed on my Mac, but I sure can't figure out what. Any suggestions are welcome.
Tempfile is a Python module used in a situation, where we need to read multiple files, change or access the data in the file, and gives output files based on the result of processed data. Each of the output files produced during the program execution was no longer needed after the program was done.
Temporary files, or "tempfiles", are mainly used to store intermediate information on disk for an application. These files are normally created for different purposes such as temporary backup or if the application is dealing with a large dataset bigger than the system's memory, etc.
tempfile. mkdtemp (suffix=None, prefix=None, dir=None) Creates a temporary directory in the most secure manner possible. There are no race conditions in the directory's creation. The directory is readable, writable, and searchable only by the creating user ID.
According to tempfile. mkstemp docs, mkstemp() returns a tuple containing an OS-level handle to an open file (as would be returned by os. open() ) and the absolute pathname of that file, in that order.
I am unable to reproduce the problem with (Apple's own build of) Python 2.5.1 on Mac OS X 10.5.9 -- runs to completion just fine!
I've tried both on a Macbook Pro, i.e., an Intel processor, and an old PowerMac, i.e., a PPC processor.
So I can only imagine there must have been a bug in 10.5.8 which I never noticed (don't have any 10.5.8 around to test, as I always upgrade promptly whenever software update offers it). All I can suggest is that you try upgrading to 10.5.9 and see if the bug disappears -- if it doesn't, I have no idea how this behavior difference between my machines and yours is possible.
I think your answer can be found here. You have to explicitly os.close()
the file descriptor given as the first part of the tuple that mkstemp
gives you.
Edit: no, the OP is already doing what is supposed to be done. I'm leaving the answer up for the nice link.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With