An internal library that heavily uses subprocess.Popen() started failing its automated tests when we upgraded from Python 2.7.3 to Python 2.7.5. This library is used in a threaded environment. After debugging the issue, I was able to create a short Python script that demonstrates the error seen as in the failing tests.
This is the script (called "threadedsubprocess.py"):
import time
import threading
import subprocess
def subprocesscall():
p = subprocess.Popen(
['ls', '-l'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
time.sleep(2) # simulate the Popen call takes some time to complete.
out, err = p.communicate()
print 'succeeding command in thread:', threading.current_thread().ident
def failingsubprocesscall():
try:
p = subprocess.Popen(
['thiscommandsurelydoesnotexist'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
except Exception as e:
print 'failing command:', e, 'in thread:', threading.current_thread().ident
print 'main thread is:', threading.current_thread().ident
subprocesscall_thread = threading.Thread(target=subprocesscall)
subprocesscall_thread.start()
failingsubprocesscall()
subprocesscall_thread.join()
Note: this script does not exit with an IOError when ran from Python 2.7.3. It does fail at least 50% of the times when ran from Python 2.7.5 (both on the same Ubuntu 12.04 64-bit VM).
The error that is raised on Python 2.7.5 is this:
/opt/python/2.7.5/bin/python ./threadedsubprocess.py
main thread is: 139899583563520
failing command: [Errno 2] No such file or directory 139899583563520
Exception in thread Thread-1:
Traceback (most recent call last):
File "/opt/python/2.7.5/lib/python2.7/threading.py", line 808, in __bootstrap_inner
self.run()
File "/opt/python/2.7.5/lib/python2.7/threading.py", line 761, in run
self.__target(*self.__args, **self.__kwargs)
File "./threadedsubprocess.py", line 13, in subprocesscall
out, err = p.communicate()
File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 806, in communicate
return self._communicate(input)
File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 1379, in _communicate
self.stdin.close()
IOError: [Errno 9] Bad file descriptor
close failed in file object destructor:
IOError: [Errno 9] Bad file descriptor
When comparing the subprocess module from Python 2.7.3 to Python 2.7.5 I see the Popen()'s __init__() call indeed now explicitly closes the stdin, stdout and stderr file descriptors in case executing the command somehow fails. This seems to be an intended fix applied in Python 2.7.4 to prevent leaking file descriptors (http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS#l629).
The diff between Python 2.7.3 and Python 2.7.5 that seems to be relevant to this issue is in the Popen __init__():
@@ -671,12 +702,33 @@
c2pread, c2pwrite,
errread, errwrite) = self._get_handles(stdin, stdout, stderr)
- self._execute_child(args, executable, preexec_fn, close_fds,
- cwd, env, universal_newlines,
- startupinfo, creationflags, shell,
- p2cread, p2cwrite,
- c2pread, c2pwrite,
- errread, errwrite)
+ try:
+ self._execute_child(args, executable, preexec_fn, close_fds,
+ cwd, env, universal_newlines,
+ startupinfo, creationflags, shell,
+ p2cread, p2cwrite,
+ c2pread, c2pwrite,
+ errread, errwrite)
+ except Exception:
+ # Preserve original exception in case os.close raises.
+ exc_type, exc_value, exc_trace = sys.exc_info()
+
+ to_close = []
+ # Only close the pipes we created.
+ if stdin == PIPE:
+ to_close.extend((p2cread, p2cwrite))
+ if stdout == PIPE:
+ to_close.extend((c2pread, c2pwrite))
+ if stderr == PIPE:
+ to_close.extend((errread, errwrite))
+
+ for fd in to_close:
+ try:
+ os.close(fd)
+ except EnvironmentError:
+ pass
+
+ raise exc_type, exc_value, exc_trace
I think I have three questions:
1) Is it true it should principally possible to use subprocess.Popen, with PIPE for stdin, stdout and stderr, in a threaded environment?
2) How do I prevent the file descriptors for stdin, stdout and stderr to be closed when the Popen() fails in one of the threads?
3) Am I doing something wrong here?
Popen do we need to close the connection or subprocess automatically closes the connection? Usually, the examples in the official documentation are complete. There the connection is not closed. So you do not need to close most probably.
. communicate() writes input (there is no input in this case so it just closes subprocess' stdin to indicate to the subprocess that there is no more input), reads all output, and waits for the subprocess to exit.
The subprocess module defines one class, Popen and a few wrapper functions that use that class. The constructor for Popen takes arguments to set up the new process so the parent can communicate with it via pipes. It provides all of the functionality of the other modules and functions it replaces, and more.
Python method popen() opens a pipe to or from command. The return value is an open file object connected to the pipe, which can be read or written depending on whether mode is 'r' (default) or 'w'. The bufsize argument has the same meaning as in open() function.
I would like to answer your questions with:
The error occurs indeed in Python 2.7.4 as well.
I think this is a bug in the library code. If you add a lock in your program and make sure that the two calls to subprocess.Popen
are executed atomically, the error does not occur.
@@ -1,32 +1,40 @@
import time
import threading
import subprocess
+lock = threading.Lock()
+
def subprocesscall():
+ lock.acquire()
p = subprocess.Popen(
['ls', '-l'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
+ lock.release()
time.sleep(2) # simulate the Popen call takes some time to complete.
out, err = p.communicate()
print 'succeeding command in thread:', threading.current_thread().ident
def failingsubprocesscall():
try:
+ lock.acquire()
p = subprocess.Popen(
['thiscommandsurelydoesnotexist'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
except Exception as e:
print 'failing command:', e, 'in thread:', threading.current_thread().ident
+ finally:
+ lock.release()
+
print 'main thread is:', threading.current_thread().ident
subprocesscall_thread = threading.Thread(target=subprocesscall)
subprocesscall_thread.start()
failingsubprocesscall()
subprocesscall_thread.join()
This means that it is most probably due to some data race in the implementation of Popen
. I will risk a guess: the bug may be in the implementation of pipe_cloexec
, called by _get_handles
, which (in 2.7.4) is:
def pipe_cloexec(self):
"""Create a pipe with FDs set CLOEXEC."""
# Pipes' FDs are set CLOEXEC by default because we don't want them
# to be inherited by other subprocesses: the CLOEXEC flag is removed
# from the child's FDs by _dup2(), between fork() and exec().
# This is not atomic: we would need the pipe2() syscall for that.
r, w = os.pipe()
self._set_cloexec_flag(r)
self._set_cloexec_flag(w)
return r, w
and the comment warns explicitly about it not being atomic... This definitely causes a data race but, without experimentation, I don't know if it's what causes the problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With