Python Popen - wait vs communicate vs CalledProcessError

Tags: python, error-handling, python-2.7, popen

Continuing from my previous question, I see that to get the exit code of a process I spawned via Popen in Python I have to call either wait() or communicate() (the latter can also be used to read what the process wrote to stdout and stderr):

import os
import re
from subprocess import Popen, PIPE

app7z = '/path/to/7z.exe'
command = [app7z, 'a', dstFile.temp, "-y", "-r", os.path.join(src.Dir, '*')]
process = Popen(command, stdout=PIPE, startupinfo=startupinfo)
out = process.stdout
regCompressMatch = re.compile(r'Compressing\s+(.+)').match
regErrMatch = re.compile('Error: (.*)').match
errorLine = []
for line in out:
    if len(errorLine) or regErrMatch(line):
        errorLine.append(line)
    if regCompressMatch(line):
        pass  # update a progress bar
result = process.wait() # HERE
if result: # in the hopes that 7z returns 0 for correct execution
    dstFile.temp.remove()
    raise StateError(_("%s: Compression failed:\n%s") % (dstFile.s,
                       "\n".join(errorLine)))

However, the docs warn that wait() may deadlock (when stdout=PIPE, which is the case here), while communicate() buffers all output in memory and might run out of it. So:

1. What is the proper thing to use here? Note that I do use the output.

2. How exactly should I use communicate()? Would it be:

process = Popen(command, stdout=PIPE, startupinfo=startupinfo)
out = process.communicate()[0]
# same as before...
result = process.returncode
if result: # ...

I am not sure about blocking and the memory errors.

3. Is there any better/more Pythonic way of handling the problem? I do not think that subprocess.CalledProcessError or subprocess.check_call/check_output apply in my case, or do they?

DISCLAIMER: I did not write the code, I am the current maintainer, hence question 3.

Related:

- Python popen command. Wait until the command is finished
- Check a command's return code when subprocess raises a CalledProcessError exception
- wait process until all subprocess finish?

I am on Windows, if this makes a difference, with Python 2.7.8.

There should be one-- and preferably only one --obvious way to do it


asked Jun 22 '15 14:06

Mr_and_Mrs_D

1 Answer

- About the deadlock: it is safe to use stdout=PIPE and wait() together as long as you read from the pipe until EOF before waiting. .communicate() does the reading and calls wait() for you.
- About the memory: if the output can be unbounded, you should not use .communicate(), because it accumulates all output in memory.
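
For the .communicate() case, a minimal sketch might look like the following. It assumes the output is small enough to hold in memory. The child command is a stand-in launched through sys.executable (not the 7z command from the question), and run_and_capture is a hypothetical helper name.

```python
import sys
from subprocess import Popen, PIPE


def run_and_capture(command):
    """Run command and return (returncode, stdout_bytes).

    communicate() reads stdout until EOF and then waits for the child,
    so process.returncode is set by the time it returns.
    """
    process = Popen(command, stdout=PIPE)
    out, _ = process.communicate()  # stderr is None: it was not piped
    return process.returncode, out


# Stand-in child process (an assumption, for illustration only).
child = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]
returncode, output = run_and_capture(child)
```

There is no need to call wait() afterwards; the exit status is read from process.returncode.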

> what is the proper thing to use here?

To start a subprocess, read its output line by line, and wait for it to exit:

#!/usr/bin/env python
from subprocess import Popen, PIPE

process = Popen(command, stdout=PIPE, bufsize=1)
with process.stdout:
    for line in iter(process.stdout.readline, b''): 
        handle(line)
returncode = process.wait()

This code cannot deadlock on a full OS pipe buffer, because it keeps reading from the pipe until EOF before it calls wait(). It also supports commands with unlimited output (as long as each individual line fits in memory).
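
Applied to the question's use case (collecting error lines while streaming, then checking the exit status), the loop above could be sketched roughly as follows. The error pattern mirrors the one in the question. The child command is a stand-in launched through sys.executable, and stream_and_check and on_line are hypothetical names introduced for illustration.

```python
import re
import sys
from subprocess import Popen, PIPE

err_match = re.compile(br'Error: (.*)').match


def stream_and_check(command, on_line=None):
    """Stream stdout line by line; return (returncode, error_lines)."""
    error_lines = []
    process = Popen(command, stdout=PIPE)
    with process.stdout:
        # Reading until EOF drains the pipe, so the child cannot get
        # stuck writing to a full pipe buffer.
        for line in iter(process.stdout.readline, b''):
            # Once an error line is seen, keep everything that follows.
            if error_lines or err_match(line):
                error_lines.append(line.rstrip())
            if on_line is not None:
                on_line(line)  # e.g. update a progress bar
    # The pipe is at EOF here, so wait() only collects the exit status.
    return process.wait(), error_lines


# Stand-in child that prints two lines and exits with status 2.
child = [sys.executable, "-c",
         "import sys; print('Compressing a.txt'); "
         "print('Error: disk full'); sys.exit(2)"]
returncode, errors = stream_and_check(child)
```

A non-zero returncode together with the collected errors gives the caller what it needs to raise its own exception.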

iter() is used to read each line as soon as the subprocess flushes its stdout buffer; it works around the read-ahead bug in Python 2. A plain for line in process.stdout is enough if you do not need each line as soon as it is written and can wait for the read buffer to fill or the child process to end. See Python: read streaming input from subprocess.communicate().
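
The two-argument form of iter() calls process.stdout.readline repeatedly and stops when it returns the sentinel b'', which is what readline() returns at EOF. A small illustration, with an in-memory stream standing in for the pipe:

```python
import io

stream = io.BytesIO(b"first\nsecond\nthird")  # stand-in for process.stdout

# iter(callable, sentinel) calls stream.readline() until it returns b''.
lines = list(iter(stream.readline, b''))
```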

If you know that the command output fits in memory in all cases, you can get the output all at once:

#!/usr/bin/env python
from subprocess import check_output

all_output = check_output(command)

check_output() raises CalledProcessError if the command returns with a non-zero exit status. Internally, check_output() uses Popen() and .communicate().
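
If check_output() fits the use case, handling a failure could look roughly like this. CalledProcessError carries the exit status in its returncode attribute and the captured stdout in its output attribute. The child command is a stand-in launched through sys.executable.

```python
import sys
from subprocess import check_output, CalledProcessError

# Stand-in child that writes to stdout and exits with status 1.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write('partial'); sys.exit(1)"]

try:
    all_output = check_output(child)
    status = 0
except CalledProcessError as e:
    all_output = e.output    # whatever the child wrote before failing
    status = e.returncode
```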

> There should be one-- and preferably only one --obvious way to do it

subprocess.Popen() is the main API and it covers a great many cases. There are convenience functions and methods such as Popen.communicate(), check_output(), and check_call() for common use cases.

There are multiple methods and functions because there are multiple different use cases.


answered Oct 28 '22 00:10

jfs
