
Python Popen on Windows with multithreading - can't delete stdout/stderr logs

Using Python 2.7.4 on Windows (note: WinXP; a commenter below suggests this works correctly on Win7), I have a script that creates several threads, each of which runs a child process via Popen with stdout/stderr redirected to files and then calls wait(). Each Popen has its own stdout/stderr files. After each process returns, I sometimes have to delete the files (actually, move them elsewhere).

I'm finding that I can't delete the stdout/stderr logs until after all the wait() calls return. Prior to that I get "WindowsError: [Error 32] The process cannot access the file because it is being used by another process". It seems that Popen is somehow holding onto the stderr files for as long as there is at least one child process open, even though the files are not shared.

Test code to reproduce below.

C:\test1.py

import subprocess
import threading
import os

def retryDelete(p, idx):
    # Spin until the file can be deleted.
    while True:
        try:
            os.unlink(p)
        except WindowsError as e:
            # winerror 32 is the sharing violation ("being used by another
            # process"); re-raise anything else with its original traceback.
            if e.winerror != 32:
                raise
        else:
            print "Deleted logs", idx
            return

class Test(threading.Thread):
    def __init__(self, idx):
        threading.Thread.__init__(self)
        self.idx = idx

    def run(self):
        print "Creating %d" % self.idx
        stdof = open("stdout%d.log" % self.idx, "w")
        stdef = open("stderr%d.log" % self.idx, "w")
        p = subprocess.Popen("c:\\Python27\\python.exe test2.py %d" % self.idx,
                             stdout=stdof, stderr = stdef)
        print "Waiting %d" % self.idx
        p.wait()
        print "Starting deleting logs %d" % self.idx
        stdof.close()
        stdef.close()
        retryDelete("stderr%d.log" % self.idx, self.idx)
        print "Done %d" % self.idx

threads = [Test(i) for i in range(0, 10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

C:\test2.py

import time
import sys

print "Sleeping",sys.argv[1]
time.sleep(int(sys.argv[1]))
print "Exiting",sys.argv[1]

If you run this, you will see that each retryDelete() spins on the file access error until all the child processes have finished.
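The loop in retryDelete() retries without pausing, so it keeps a CPU core busy while it waits. A gentler variant, sketched here under the assumption that only Windows error 32 should be retried, sleeps between attempts and gives up after a timeout. The name delete_with_retry and its timeout and interval parameters are hypothetical; the code is written to run on both Python 2.7 and Python 3.

```python
import os
import time

# Windows error code for "The process cannot access the file because it is
# being used by another process", as quoted in the question.
ERROR_SHARING_VIOLATION = 32


def delete_with_retry(path, timeout=5.0, interval=0.1):
    """Delete path, retrying while Windows reports a sharing violation.

    Returns True once the file is deleted, or False if `timeout` seconds
    pass without success. Any other error is re-raised immediately.
    """
    deadline = time.time() + timeout
    while True:
        try:
            os.unlink(path)
            return True
        except OSError as e:
            # On non-Windows platforms the attribute is absent, so every
            # error propagates instead of being retried.
            if getattr(e, "winerror", None) != ERROR_SHARING_VIOLATION:
                raise
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```

This does not address the underlying handle problem; it only makes the symptom cheaper to wait out and easier to report.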

UPDATE: The issue happens even if the stdof and stdef file objects are not passed to the Popen constructor. However, it does not happen (i.e. the deletes succeed immediately) if the Popen is removed and the wait() is replaced with time.sleep(self.idx). Since Popen appears to affect files that are not passed to it, I wonder whether this issue is related to handle inheritance.

UPDATE: close_fds=True gives an error (not supported on Windows when redirecting stdout/stderr), and deleting the Popen object with del p after the wait() call makes no difference to the issue.

UPDATE: Used sysinternals process explorer to look for processes with handles to the file. Reduced the test to just 2 threads/children and made the second one remain open for a long time. Handle search showed that the only process with handles to stderr0.log was the parent python process, which had two handles open to it.

UPDATE: For my current, urgent use, I've found a workaround: a separate helper script that takes the command line and the stdout/stderr log files as parameters and runs the child process with its output redirected. The parent then just executes this helper script with os.system(). The log files are then released and can be deleted. However, I'm still very interested in the answer to this question. It feels like a WinXP-specific bug to me, but it's still possible I'm just doing something wrong.
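The helper script itself is not shown in the question. A minimal sketch of what such a helper might look like follows; the names run_redirected and main and the argument order (stdout log, stderr log, then the command) are assumptions made for illustration, not the author's actual script.

```python
import subprocess
import sys


def run_redirected(cmd, stdout_path, stderr_path):
    """Run cmd (a list of arguments) with stdout/stderr written to files.

    Returns the child's exit code. The log files are opened and closed
    inside this process only, so the caller never holds handles to them.
    """
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        return subprocess.call(cmd, stdout=out, stderr=err)


def main(argv):
    # Hypothetical usage: helper.py STDOUT_LOG STDERR_LOG COMMAND [ARGS...]
    if len(argv) < 4:
        sys.stderr.write("usage: helper.py STDOUT_LOG STDERR_LOG COMMAND [ARGS...]\n")
        return 2
    return run_redirected(argv[3:], argv[1], argv[2])


# Only act as a script when arguments were actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

With a file like this saved as helper.py (a hypothetical name), each thread in the parent would call something along the lines of os.system("python helper.py stdout0.log stderr0.log python test2.py 0") instead of opening the log files itself.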

Tom asked Apr 12 '13 08:04


1 Answer

This question is old, and the bug has been fixed in Python 3.4+. For the record, here is a hack we have been using to work around the issue on Python 2.7 and on Python 3.3 and earlier.

The function is pure Python (standard library only, no third-party packages) and works only on Windows.

Call the following function before starting the subprocess:

def _hack_windows_subprocess():
    """HACK: python 2.7 file descriptors.
    This magic hack fixes https://bugs.python.org/issue19575
    by clearing HANDLE_FLAG_INHERIT on all already opened file descriptors.
    """
    # Extracted from https://github.com/secdev/scapy/issues/1136
    import os
    import stat
    from ctypes import windll, wintypes
    from msvcrt import get_osfhandle

    HANDLE_FLAG_INHERIT = 0x00000001

    for fd in range(100):
        try:
            s = os.fstat(fd)
        except Exception:
            # Stop at the first descriptor that is not open.
            break
        if stat.S_ISREG(s.st_mode):
            handle = wintypes.HANDLE(get_osfhandle(fd))
            # mask selects the inherit bit; flags=0 switches it off.
            mask   = wintypes.DWORD(HANDLE_FLAG_INHERIT)
            flags  = wintypes.DWORD(0)
            windll.kernel32.SetHandleInformation(handle, mask, flags)

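This function scans file descriptors 0 to 99, stopping at the first one that is not open, and clears the inherit flag on every descriptor that refers to a regular file. Child processes then no longer inherit those handles, which fixes the bug. The limit of 100 can be increased if needed.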
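On Python 3.4 and later, the standard library exposes the inheritable flag directly through os.get_inheritable() and os.set_inheritable(), so the same idea can be sketched without ctypes. The following is a minimal illustration, not part of the original hack: the function name make_non_inheritable and its start and limit parameters are hypothetical, and by default it skips descriptors 0 to 2 so the standard streams are left untouched.

```python
import os
import stat


def make_non_inheritable(limit=100, start=3):
    """Clear the inheritable flag on open regular-file descriptors.

    Scans descriptors in range(start, limit) and returns the list of
    descriptors whose flag was actually changed. Requires Python 3.4+.
    """
    changed = []
    for fd in range(start, limit):
        try:
            mode = os.fstat(fd).st_mode
        except OSError:
            # Descriptor is not open; keep scanning rather than stopping.
            continue
        if stat.S_ISREG(mode) and os.get_inheritable(fd):
            os.set_inheritable(fd, False)
            changed.append(fd)
    return changed
```

Files opened with open() on Python 3.4+ are already non-inheritable by default, so a helper like this only matters for descriptors that were explicitly made inheritable or came from code that bypasses that default.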

Cukic0d answered Oct 16 '22 06:10