Python tempfile module and threads aren't playing nice; what am I doing wrong?

I'm having an interesting problem with threads and the tempfile module in Python. Something doesn't appear to be getting cleaned up until the threads exit, and I'm running up against the open file limit. (This is on OS X 10.5.8, Python 2.5.1.)

Yet if I sort of replicate what the tempfile module is doing (not all the security checks, but just generating a file descriptor and then using os.fdopen to produce a file object) I have no problems.

Before filing this as a bug with Python, I figured I'd check here, as it's much more likely that I'm doing something subtly wrong. But if I am, a day of trying to figure it out hasn't gotten me anywhere.

#!/usr/bin/python

import threading
import thread
import tempfile
import os
import time
import sys

NUM_THREADS = 10000

def worker_tempfile():
    tempfd, tempfn = tempfile.mkstemp()
    tempobj = os.fdopen(tempfd, 'wb')
    tempobj.write('hello, world')
    tempobj.close()
    os.remove(tempfn)
    time.sleep(10)

def worker_notempfile(index):
    tempfn = str(index) + '.txt'
    # The values I'm passing os.open may be different than tempfile.mkstemp 
    # uses, but it works this way as does using the open() function to create
    # a file object directly.
    tempfd = os.open(tempfn, 
                     os.O_EXCL | os.O_CREAT | os.O_TRUNC | os.O_RDWR)
    tempobj = os.fdopen(tempfd, 'wb')
    tempobj.write('hello, world')
    tempobj.close()
    os.remove(tempfn)
    time.sleep(10)

def main():
    for count in range(NUM_THREADS):
        if count % 100 == 0:
            print('Opening thread %s' % count)
        wthread = threading.Thread(target=worker_tempfile)
        #wthread = threading.Thread(target=worker_notempfile, args=(count,))
        started = False
        while not started:
            try:
                wthread.start()
                started = True
            except thread.error:
                print('failed starting thread %s; sleeping' % count)
                time.sleep(3)

if __name__ == '__main__':
    main()

If I run it with the worker_notempfile line active and the worker_tempfile line commented out, it runs to completion.

The other way around (using worker_tempfile) I get the following error:

$ python threadtempfiletest.py 
Opening thread 0
Opening thread 100
Opening thread 200
Opening thread 300
Exception in thread Thread-301:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 460, in __bootstrap
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/threading.py", line 440, in run
  File "threadtempfiletest.py", line 17, in worker_tempfile
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tempfile.py", line 302, in mkstemp
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tempfile.py", line 236, in _mkstemp_inner
OSError: [Errno 24] Too many open files: '/var/folders/4L/4LtD6bCvEoipksvnAcJ2Ok+++Tk/-Tmp-/tmpJ6wjV0'

Any ideas what I'm doing wrong? Is this a bug in Python, or am I being bone-headed?

UPDATE 2009-12-14: I think I've found the answer, but I don't like it. Since nobody was able to replicate the problem, I went hunting around our office for machines. It passed on everything except my machine. I tested on a Mac with the same software versions I was using. I even went hunting for a Desktop G5 with the EXACT same hardware and software config I had -- same result. Both tests (with tempfile and without tempfile) succeeded on everything.

For kicks, I downloaded Python 2.6.4 and tried it on my desktop, and saw the same pattern as with Python 2.5.1: tempfile failed, and notempfile succeeded.

This is leading me to the conclusion that something's hosed on my Mac, but I sure can't figure out what. Any suggestions are welcome.
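One thing that might be worth comparing between the failing machine and the others is the per-process open-file limit, since Errno 24 is raised when a process reaches it. The snippet below is a minimal sketch using the standard resource module (Unix only); the helper name nofile_limits is made up for illustration, and a lower limit on one machine would not by itself prove that it is the cause.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is the per-process cap on open descriptors;
    # the soft limit is the one that triggers "Too many open files".
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = nofile_limits()
print('soft limit: %s, hard limit: %s' % (soft, hard))
```

Running this on each machine and comparing the soft limit with the number of threads alive at the point of failure may help narrow things down.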

Schof asked Dec 13 '09 02:12



2 Answers

I am unable to reproduce the problem with (Apple's own build of) Python 2.5.1 on Mac OS X 10.5.9 -- runs to completion just fine!

I've tried both on a MacBook Pro, i.e., an Intel processor, and an old PowerMac, i.e., a PPC processor.

So I can only imagine there must have been a bug in 10.5.8 which I never noticed (don't have any 10.5.8 around to test, as I always upgrade promptly whenever software update offers it). All I can suggest is that you try upgrading to 10.5.9 and see if the bug disappears -- if it doesn't, I have no idea how this behavior difference between my machines and yours is possible.

Alex Martelli answered Sep 28 '22 17:09


I think your answer can be found here. You have to explicitly os.close() the file descriptor given as the first part of the tuple that mkstemp gives you.

Edit: no, the OP is already doing what is supposed to be done. I'm leaving the answer up for the nice link.
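For reference, here is a small sketch of the descriptor handling discussed above. It assumes a current Python and uses two helper names invented for illustration (write_and_remove_tempfile and is_fd_open). It shows that wrapping the descriptor from mkstemp with os.fdopen and closing the resulting file object also closes the underlying descriptor, so no separate os.close() is needed in that case.

```python
import os
import tempfile


def is_fd_open(fd):
    """Return True if fd currently refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def write_and_remove_tempfile(data=b'hello, world'):
    """Create a temp file, write data, clean up, and return the descriptor number used."""
    fd, path = tempfile.mkstemp()
    try:
        # The file object takes ownership of fd; leaving the with-block
        # closes the file object and, with it, the descriptor.
        with os.fdopen(fd, 'wb') as tempobj:
            tempobj.write(data)
    finally:
        os.remove(path)
    return fd


fd = write_and_remove_tempfile()
print(is_fd_open(fd))
```

If the descriptor is never wrapped in a file object, it does have to be released with os.close(fd) explicitly.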

Jonathan Feinberg answered Sep 28 '22 16:09