Why do I have to use .wait() with python's subprocess module?

Tags:

python

subprocess

I'm running a Perl script through the subprocess module in Python on Linux. The function that runs the script is called several times with variable input.

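import subprocess

def script_runner(variable_input):
    # One output file and one error file per input value.
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    # Popen returns immediately; the Perl script keeps running in the background.
    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)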

However, if I run this function, say, twice, the execution of the first process will stop when the second process starts. I can get my desired behavior by adding

process.wait()

after calling the script, so I'm not really stuck. However, I want to find out why I cannot run the script using subprocess as many times as I want, and have the runs do their computations in parallel, without having to wait for each one to finish before starting the next.

UPDATE

The culprit was not so exciting: the Perl script used a common file that was rewritten for each execution.

However, the lesson I learned from this was that the garbage collector does not kill the child process once it has started: after I fixed the shared file, letting the Popen object go out of scope had no effect on the running scripts.
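
For anyone who runs into the same thing, here is a minimal sketch of one way to keep concurrent runs from overwriting each other's file: give every child its own scratch directory through the cwd argument of subprocess.Popen. It assumes the script writes its common file using a relative path, which may not hold for your script. The names run_isolated, common.txt and the run_ prefix are made up for this illustration, and a small Python one-liner stands in for the Perl script.

```python
import os
import subprocess
import sys
import tempfile


def run_isolated(command, count):
    """Run `count` copies of `command` at once, each in its own directory."""
    # One fresh scratch directory per child, so relative paths cannot collide.
    scratch_dirs = [tempfile.mkdtemp(prefix='run_') for _ in range(count)]
    # Start all children before waiting on any of them.
    children = [subprocess.Popen(command, cwd=d) for d in scratch_dirs]
    return_codes = [child.wait() for child in children]
    return scratch_dirs, return_codes


if __name__ == '__main__':
    # Stand-in for the Perl script (assumption): writes 'common.txt'
    # relative to its working directory.
    child_code = "import os; open('common.txt', 'w').write(str(os.getpid()))"
    dirs, codes = run_isolated([sys.executable, '-c', child_code], 3)
    for d, code in zip(dirs, codes):
        print(d, code, os.listdir(d))
```

Because the working directory changes for each child, the script itself would have to be passed as an absolute path rather than a relative one.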

asked Nov 12 '10 13:11

Viktiglemma

1 Answer

If you are using Unix, and wish to run many processes in the background, you could use subprocess.Popen this way:

x_fork_many.py:

import subprocess
import os
import sys
import time
import random
import gc  # Only used to test the hypothesis that garbage collection of p = Popen() is causing the problem.

# This spawns many (3) children in quick succession
# and then reports as each child finishes.
if __name__ == '__main__':
    N = 3
    if len(sys.argv) > 1:
        # Child mode: sleep for a random time, then exit.
        x = random.randint(1, 10)
        print('{p} sleeping for {x} sec'.format(p=os.getpid(), x=x))
        time.sleep(x)
    else:
        for script in range(N):
            # Re-run this same file with an extra argument to select child mode.
            args = [sys.executable, os.path.abspath(__file__), 'sleep']
            p = subprocess.Popen(args)  # rebinding p drops the previous Popen object
        gc.collect()
        for i in range(N):
            pid, retval = os.wait()  # blocks until any child exits
            print('{p} finished'.format(p=pid))

The output looks something like this:

% x_fork_many.py 
15562 sleeping for 10 sec
15563 sleeping for 5 sec
15564 sleeping for 6 sec
15563 finished
15564 finished
15562 finished

I'm not sure why you are seeing that behavior when you don't call .wait(). However, the script above suggests (at least on Unix) that saving the subprocess.Popen(...) objects in a list or set is not necessary: the children keep running after their Popen objects are dropped. Whatever the problem is, I don't think it has to do with garbage collection.
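
Keeping the Popen objects is still convenient if you want to close each child's output files and read its exit code once it is done. Applied to the function in the question, a rough sketch might look like the following. The name run_all, the out_dir parameter, and passing the input value as the last command-line argument are my assumptions, not something taken from your script; a Python one-liner stands in for ['perl', 'script'].

```python
import os
import subprocess
import sys


def run_all(command, inputs, out_dir='.'):
    """Start one child per input, then wait for all; return {input: exit code}."""
    started = []
    for value in inputs:
        out_file = open(os.path.join(out_dir, 'out_' + value), 'wt')
        error_file = open(os.path.join(out_dir, 'error_' + value), 'wt')
        # Popen returns as soon as the child is started; it does not wait.
        process = subprocess.Popen(command + [value],
                                   stdout=out_file, stderr=error_file)
        started.append((value, process, out_file, error_file))

    exit_codes = {}
    for value, process, out_file, error_file in started:
        exit_codes[value] = process.wait()  # only now block, child by child
        out_file.close()
        error_file.close()
    return exit_codes


if __name__ == '__main__':
    # Stand-in for ['perl', 'script'] (assumption): echoes its argument.
    echo = [sys.executable, '-c', 'import sys; print(sys.argv[1])']
    print(run_all(echo, ['a', 'b']))
```

All children are started before the first wait() call, so they run side by side and the waiting only happens at the end.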

PS. Maybe your Perl scripts are conflicting in some way, which causes one to end with an error when another one is running. Have you tried starting multiple calls to the Perl script from the command line?
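
The same check can be done from Python. This is only a sketch under my own assumptions: run_concurrently is a made-up helper, and text=True needs Python 3.7 or later. It starts several copies at once and collects each one's exit code and stderr, so a run that dies because of a conflict should show up with a non-zero code and an error message.

```python
import subprocess
import sys


def run_concurrently(command, count):
    """Start `count` copies at once; return (exit code, stderr text) for each."""
    children = [
        subprocess.Popen(command, stdout=subprocess.DEVNULL,
                         stderr=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    results = []
    for child in children:
        # communicate() reads stderr to the end and waits for the child.
        _, err = child.communicate()
        results.append((child.returncode, err))
    return results


if __name__ == '__main__':
    # Stand-in for ['perl', 'script', 'options'] (assumption): always fails.
    failing = [sys.executable, '-c',
               "import sys; sys.stderr.write('boom'); sys.exit(3)"]
    for code, err in run_concurrently(failing, 2):
        print(code, repr(err))
```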

answered Oct 12 '22 20:10

unutbu
