
Python subprocess running out of file descriptors

I've got a long-running Python project that uses the subprocess module to start various other programs. It waits for each program to finish, then ends the wrapper function and returns to its wait loop.

Eventually, this brings the computer it's running on to a grinding halt, with an error saying that there are no more file descriptors available.

I can't find anything in the subprocess docs about what happens to file descriptors when a child process exits. At first, I thought they would be closed automatically, since subprocess.call() waits until the child terminates.

But if that were the case, I wouldn't have a problem. I also thought that if anything was left over, Python would garbage collect it when the function finishes and the file descriptors go out of scope. But this doesn't seem to be the case either.

How would I get access to these file descriptors? The subprocess.call() function only returns the exit code, not the open file descriptors. Is there something else I'm missing here?

This project acts as glue between various enterprise apps. These apps cannot be pipelined, and they are GUI systems. So, the only thing I can do is start them off with their built-in macros. These macros output text files, which I use for the next program in the pipe.

Yes, it is as bad as it sounds. Luckily, all the files end up having pretty unique names. So, over the next few days I'll be using the Sysinternals tool suggested below to try to track down the file. I'll let you know how it turns out.

Most of the files I don't open, I just move them with the win32file.CopyFile() function.

Spencer Rathbun asked Jul 12 '11 19:07


2 Answers

I have had the same issue.

We constantly use subprocess.Popen() to invoke external tools in a Windows environment. At some point, we had an issue where no more file descriptors were available. We drilled down to the issue and discovered that subprocess.Popen instances behave differently in Windows than in Linux.

If the Popen instance is not destroyed (e.g. because a reference to it is kept somewhere, which prevents the garbage collector from destroying the object), the pipes that were created during the call remain open in Windows, while in Linux they are closed automatically after Popen.communicate() is called. If this continues over further calls, the "zombie" file descriptors from the pipes pile up and eventually cause Python to raise IOError: [Errno 24] Too many open files.
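
One way to avoid depending on garbage collection is to close the pipes explicitly once the child has finished. The following is only a minimal sketch, written for Python 3 rather than the Python 2.7 used in this answer; the helper name run_and_release is made up for illustration and is not part of subprocess.

```python
import subprocess
import sys


def run_and_release(cmd):
    """Run cmd, collect its output, and close all pipe ends explicitly.

    Hypothetical helper: it does not rely on the Popen object being
    garbage collected to release the pipe file descriptors.
    """
    p = subprocess.Popen(cmd,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    try:
        out, err = p.communicate()
    finally:
        # Closing an already closed pipe is a no-op, so this is safe
        # on platforms where communicate() has closed them already.
        for pipe in (p.stdin, p.stdout, p.stderr):
            if pipe is not None and not pipe.closed:
                pipe.close()
    return p.returncode, out, err


if __name__ == '__main__':
    code, out, err = run_and_release([sys.executable, '-c', 'print("hi")'])
    print(code, out, err)
```

With this pattern, keeping a reference to the Popen object (or to the returned values) afterwards should no longer keep the pipe descriptors open.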

How to Get Opened File Descriptors in Python

To troubleshoot our issues, we needed a way to list the valid file descriptors from within a Python script. So, we crafted the following script. Note that we only check file descriptors from 0 to 99, since we do not open that many files concurrently.

fd_table_status.py :

import os
import stat

_fd_types = (
    ('REG', stat.S_ISREG),
    ('FIFO', stat.S_ISFIFO),
    ('DIR', stat.S_ISDIR),
    ('CHR', stat.S_ISCHR),
    ('BLK', stat.S_ISBLK),
    ('LNK', stat.S_ISLNK),
    ('SOCK', stat.S_ISSOCK)
)

def fd_table_status():
    result = []
    for fd in range(100):
        try:
            s = os.fstat(fd)
        except OSError:
            # fd is not an open descriptor; skip it
            continue
        for fd_type, func in _fd_types:
            if func(s.st_mode):
                break
        else:
            fd_type = str(s.st_mode)
        result.append((fd, fd_type))
    return result

def fd_table_status_logify(fd_table_result):
    return ('Open file handles: ' +
            ', '.join(['{0}: {1}'.format(*i) for i in fd_table_result]))

def fd_table_status_str():
    return fd_table_status_logify(fd_table_status())

if __name__=='__main__':
    # Parenthesized form prints the same text under Python 2 and Python 3
    print(fd_table_status_str())

When run directly, it shows all open file descriptors and their respective types:

$> python fd_table_status.py
Open file handles: 0: CHR, 1: CHR, 2: CHR
$>

Calling fd_table_status_str() from Python code produces the same output. For the meaning of "CHR" and the other short codes, see the Python documentation for the stat module.
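
The same os.fstat() probing can also be used to compare the descriptor table before and after a piece of code, which makes a leak easier to spot than reading full listings. This is a rough sketch for Python 3; the names open_fds and leaked_fds and the default limit of 1024 are assumptions chosen for illustration, not part of the script above.

```python
import os


def open_fds(limit=1024):
    """Return the set of descriptors below limit that os.fstat() accepts."""
    fds = set()
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            # Not an open descriptor
            continue
        fds.add(fd)
    return fds


def leaked_fds(action, limit=1024):
    """Call action() and return the descriptors that are newly open afterwards."""
    before = open_fds(limit)
    action()
    return sorted(open_fds(limit) - before)


if __name__ == '__main__':
    kept = []
    # The file object is kept alive on purpose, so its descriptor stays open.
    print(leaked_fds(lambda: kept.append(open(os.devnull))))
    kept[0].close()
```

Wrapping a suspect call in leaked_fds() and logging the result after each run is one way to find which step leaves descriptors behind.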

Testing file descriptor behavior

Try running the following script in Linux and Windows:

test_fd_handling.py :

import fd_table_status
import subprocess
import platform

fds = fd_table_status.fd_table_status_str

if platform.system()=='Windows':
    python_exe = r'C:\Python27\python.exe'
else:
    python_exe = 'python'

print '1) Initial file descriptors:\n' + fds()
f = open('fd_table_status.py', 'r')
print '2) After file open, before Popen:\n' + fds()
# Use the interpreter path chosen above instead of a hard-coded 'python'
p = subprocess.Popen([python_exe, 'fd_table_status.py'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
print '3) After Popen, before reading piped output:\n' + fds()
result = p.communicate()
print '4) After Popen.communicate():\n' + fds()
del p
print '5) After deleting reference to Popen instance:\n' + fds()
del f
print '6) After deleting reference to file instance:\n' + fds()
print '7) child process had the following file descriptors:'
print result[0][:-1]

Linux output

1) Initial file descriptors:
Open file handles: 0: CHR, 1: CHR, 2: CHR
2) After file open, before Popen:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
3) After Popen, before reading piped output:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 5: FIFO, 6: FIFO, 8: FIFO
4) After Popen.communicate():
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
5) After deleting reference to Popen instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
6) After deleting reference to file instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR
7) child process had the following file descriptors:
Open file handles: 0: FIFO, 1: FIFO, 2: FIFO, 3: REG

Windows output

1) Initial file descriptors:
Open file handles: 0: CHR, 1: CHR, 2: CHR
2) After file open, before Popen:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
3) After Popen, before reading piped output:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 4: FIFO, 5: FIFO, 6: FIFO
4) After Popen.communicate():
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG, 5: FIFO, 6: FIFO
5) After deleting reference to Popen instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR, 3: REG
6) After deleting reference to file instance:
Open file handles: 0: CHR, 1: CHR, 2: CHR
7) child process had the following file descriptors:
Open file handles: 0: FIFO, 1: FIFO, 2: FIFO

As you can see in step 4, Windows does not behave the same as Linux: on Windows, the Popen instance must be destroyed before the pipes are closed.
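
For readers on Python 3.2 or later, Popen objects can also be used as context managers: on leaving the with block, the standard pipes are closed and the process is waited for, regardless of whether the object is destroyed. The sketch below assumes Python 3 and was not part of the Python 2.7 test above; run_with_context is a made-up name.

```python
import subprocess
import sys


def run_with_context(cmd):
    """Run cmd inside a with block so the pipes are closed on exit."""
    with subprocess.Popen(cmd,
                          stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE) as p:
        out, err = p.communicate()
    # p is returned (and so still referenced) only to show that its
    # pipes are closed even though the object is alive.
    return p, out, err


if __name__ == '__main__':
    p, out, err = run_with_context([sys.executable, '-c', 'print(42)'])
    print(p.returncode, p.stdout.closed, p.stderr.closed)
```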

By the way, the difference in step 7 points to a separate issue concerning the behavior of the Python interpreter in Windows. You can see more details on both issues here.

mihalop answered Sep 27 '22 19:09


What Python version are you using? There is a known leak of file descriptors with subprocess.Popen() that might also affect subprocess.call():

http://bugs.python.org/issue6274

As you can see, this was only fixed in Python 2.6.

Mario answered Sep 27 '22 18:09