
Is subprocess.Popen not thread safe?

The following simple script intermittently hangs on the subprocess.Popen call (roughly 30% of the time).
With use_lock = True it never hangs, which leads me to believe that subprocess is not thread safe. The expected behavior is that the script finishes within 5-6 seconds.
To reproduce the bug, run "python bugProof.py" a few times until it hangs; Ctrl-C exits. When it hangs, 'post-Popen' is printed only once or twice instead of three times.

import subprocess, threading, fcntl, os, time
end_time = time.time()+5
lock = threading.Lock()
use_lock = False
path_to_factorial = os.path.join(os.path.dirname(os.path.realpath(__file__)),'factorial.sh')

def testFunction():
    print threading.current_thread().name, '| pre-Popen'
    if use_lock: lock.acquire()
    p = subprocess.Popen([path_to_factorial], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if use_lock: lock.release()
    print threading.current_thread().name, '| post-Popen'
    fcntl.fcntl(p.stdout, fcntl.F_SETFL, os.O_NONBLOCK)
    fcntl.fcntl(p.stderr, fcntl.F_SETFL, os.O_NONBLOCK)
    while time.time()<end_time:
        try: p.stdout.read()
        except: pass
        try: p.stderr.read()
        except: pass
    print threading.current_thread().name, '| DONE'

for i in range(3):
    threading.Thread(target=testFunction).start()


The shell script referenced above (factorial.sh):

#!/bin/sh
echo "Calculating factorial (anything that's somewhat compute intensive, this script takes 3 sec on my machine)"
ans=1
counter=0
fact=999
while [ $fact -ne $counter ]
do
    counter=`expr $counter + 1`
    ans=`expr $ans \* $counter`
done
echo "Factorial calculation done"
read -p "Test input (this part is critical for bug to occur): " buf
echo "$buf"

System info: Linux 2.6.32-358.123.2.openstack.el6.x86_64 #1 SMP Thu Sep 26 17:14:58 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
Python 2.7.3 (default, Jan 22 2013, 11:34:30)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2

Roman asked Jan 17 '14 19:01


1 Answer

On Python 2.x, several race conditions affect subprocess.Popen. For example, on 2.7 it disables and restores garbage collection to prevent various timing issues, but that step is not itself thread-safe. See http://bugs.python.org/issue2320, http://bugs.python.org/issue1336 and http://bugs.python.org/issue14548 for a few of the issues in this area.
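
For code that has to stay on the stock Python 2.x subprocess module, the use_lock workaround from the question can be packaged as a small wrapper. This is a minimal sketch: the names _popen_lock and locked_popen are hypothetical, and it rests on the asker's observation that serializing the Popen calls made the hangs disappear, which is not a guarantee. It only helps if every thread in the process creates children through the wrapper.

```python
import subprocess
import sys
import threading

# Hypothetical module-level lock shared by every thread that spawns children.
_popen_lock = threading.Lock()


def locked_popen(*args, **kwargs):
    """Create a subprocess.Popen while holding a process-wide lock.

    Only the creation of the child is serialized; reading from its pipes
    and waiting for it happen outside the lock.
    """
    with _popen_lock:
        return subprocess.Popen(*args, **kwargs)


if __name__ == "__main__":
    # Usage: same arguments as subprocess.Popen.
    p = locked_popen([sys.executable, "-c", "print('ok')"],
                     stdout=subprocess.PIPE)
    out, _ = p.communicate()
    print(out)
```

The lock is released as soon as Popen returns, so the children still run in parallel; only their start-up is serialized.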

Python 3.2 substantially revised subprocess to address these problems. Among other things, the fork and exec code now lives in a C module, instead of running reasonably involved Python code in the critical window between fork and exec. The revision is available for recent Python 2.x releases as the subprocess32 backport module. Note the following from the PyPI page: "On POSIX systems it is guaranteed to be reliable when used in threaded applications."

I can reproduce the occasional hangs of the code above (about 25% of runs for me), but after switching to import subprocess32 as subprocess I have not seen a single failure in 100+ runs.
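
The import swap can be written so that the same file runs with or without the backport installed. The sketch below is an illustration, not the test I ran: run_child and spawn_concurrently are hypothetical helper names, and the child is a trivial Python process rather than factorial.sh. Note that on Python 2.x without subprocess32 installed, it silently falls back to the stock module and its races.

```python
import sys
import threading

# Prefer the backport when it is installed (Python 2.x); otherwise use the
# standard library module, which on Python 3.2+ already contains the rewrite.
try:
    import subprocess32 as subprocess
except ImportError:
    import subprocess


def run_child(results, index):
    # Spawn a trivial child with all three pipes, as in the question,
    # and record its exit code.
    p = subprocess.Popen(
        [sys.executable, "-c", "pass"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    p.communicate()
    results[index] = p.returncode


def spawn_concurrently(n=3):
    # Start n threads that each call Popen at roughly the same time.
    results = [None] * n
    threads = [threading.Thread(target=run_child, args=(results, i))
               for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(spawn_concurrently())
```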

Note that subprocess32 (like Python 3.2+) defaults to close_fds=True. With subprocess32 in place, however, I saw no failures even with close_fds=False (not that you should generally need that).
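
To make concrete what close_fds controls, here is a small sketch that checks whether a child process can see a descriptor that was open in the parent. It assumes Python 3.4+ on a POSIX system, where pipe descriptors are non-inheritable by default, so the example marks one as inheritable explicitly (on Python 2.7 descriptors are inheritable unless flagged otherwise). CHILD_CODE, child_sees_fd and demo are hypothetical names. A descriptor inherited by an unrelated child stays open until that child closes it or exits, which is one way a pipe can fail to reach end-of-file.

```python
import os
import subprocess
import sys

# Child program: report whether the fd number passed as argv[1] is open.
CHILD_CODE = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("open")
except OSError:
    print("closed")
"""


def child_sees_fd(fd, close_fds):
    # Run the child and return True if it found the descriptor open.
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE, str(fd)],
        close_fds=close_fds,
    )
    return out.decode().strip() == "open"


def demo():
    read_end, write_end = os.pipe()
    # Python 3.4+ creates non-inheritable fds; opt in for the demonstration.
    os.set_inheritable(write_end, True)
    try:
        return (child_sees_fd(write_end, close_fds=False),
                child_sees_fd(write_end, close_fds=True))
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(demo())
```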

codedstructure answered Oct 13 '22 08:10