In the Python 2.7 documentation of the subprocess module, I found the following snippet:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
Source : https://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline
I don't understand this line : p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
Here p1.stdout is being closed. How does it allow p1 to receive a SIGPIPE if p2 exits?
The SIGPIPE signal is normally sent when a process attempts to write to a pipe whose read end is no longer held open by any process. In the shell pipeline equivalent of your code snippet:
`dmesg | grep hda`
If the `grep` process for some reason terminates before `dmesg` is done writing output, `dmesg` will receive a SIGPIPE and terminate. This is the expected behavior for UNIX/Linux processes (http://en.wikipedia.org/wiki/Unix_signal).
By contrast, in the Python implementation using `subprocess`, if `p2` exits before `p1` is done generating output, the SIGPIPE doesn't get sent because there is still a process holding the read end of the pipe open: the Python script itself (the one which created `p1` and `p2`). More importantly, the script holds the pipe open but does not consume its contents. The effect is that once the pipe buffer fills up, `p1` blocks on its next write and stays stuck in limbo indefinitely.
Explicitly closing `p1.stdout` detaches the Python script from the pipe, so that no process other than `p2` holds its read end. That way, if `p2` does end before `p1`, `p1` properly gets the signal to end itself without anything artificially holding the pipe open.
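To see the difference for yourself, here is a minimal sketch. It assumes Python 3 on a POSIX system with the `yes` and `head` utilities available, and it swaps `dmesg | grep hda` for `yes | head -n 1` so that the reader is guaranteed to exit first. The function name `run_pipeline` and its `close_parent_copy` parameter are made up for this illustration. It also relies on Python 3's `Popen` restoring the default SIGPIPE handling in the child (`restore_signals=True`), which Python 2.7 does not do.

```python
import subprocess


def run_pipeline(close_parent_copy):
    """Run the equivalent of `yes | head -n 1` and report how `yes` ended.

    Returns the return code of `yes`, or None if `yes` was still running
    (blocked writing to a full pipe) when the timeout expired.
    """
    p1 = subprocess.Popen(["yes"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["head", "-n", "1"], stdin=p1.stdout,
                          stdout=subprocess.PIPE)
    if close_parent_copy:
        # Drop the parent's copy of the read end; only p2 holds it now.
        p1.stdout.close()
    p2.communicate()  # head prints one line and exits
    try:
        return p1.wait(timeout=1)
    except subprocess.TimeoutExpired:
        # The parent still holds the read end, so p1 never got a SIGPIPE.
        p1.kill()
        p1.wait()
        return None
    finally:
        # Closing an already closed file object is a no-op.
        p1.stdout.close()
```

With `close_parent_copy=True` the return code should be the negative of `signal.SIGPIPE`, meaning `yes` was killed by that signal. With `close_parent_copy=False` the function should return `None`, because `yes` just sits there blocked until it is killed explicitly.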
Here is an alternatively worded explanation: http://www.enricozini.org/2009/debian/python-pipes/
A hopefully more systematic explanation:
If you open a file handle, you need to `close()` it again! If a process exits, the operating system closes the corresponding file handle for you.
A process reading from a pipe may decide to `close()` their file handle representing the read end of the pipe. Nothing wrong with that, this is a perfectly fine situation.
In that case, once no process holds the read end open anymore, the kernel sends a `SIGPIPE` signal to the writing process on its next write, for it to know that there is no reader anymore.
This is the standard mechanism by which the receiving program can implicitly tell the sending program that it has stopped reading. Have you ever wondered if `cat bigfile | head -n5` actually reads the entire bigfile? No, it does not, because `cat` receives a `SIGPIPE` signal as soon as it writes to the pipe after `head` has exited (after reading 5 lines from stdin). The important thing to appreciate: `cat` has been designed to actually respond to `SIGPIPE` (that is an important engineering decision ;)): it stops reading the file and exits. Other programs are designed to ignore `SIGPIPE` (on purpose, these handle this situation on their own -- this is common in networking applications).
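Python itself is an example of a program that ignores `SIGPIPE` on purpose: CPython sets the signal to "ignore" at startup, so a write to a pipe without a reader shows up as an ordinary exception instead of killing the process. A small sketch of this, assuming Python 3 on a POSIX system and an interpreter whose `SIGPIPE` handling has not been changed (the function name `write_without_reader` is made up for this illustration):

```python
import os


def write_without_reader():
    """Write to a pipe whose only read end has already been closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the last read end is gone
    try:
        os.write(write_fd, b"hello\n")
        return "written"
    except BrokenPipeError:
        # SIGPIPE is ignored, so the kernel reports EPIPE instead.
        return "broken pipe"
    finally:
        os.close(write_fd)
```

A program that leaves `SIGPIPE` at its default, as `cat` does, would simply be terminated by the signal at the `os.write` step instead of getting an error to handle.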
If you keep the read end of the pipe open in your controlling process, you disable the described mechanism. `dmesg` will not be able to notice that `grep` has exited.
However, your example actually is not a good one. `grep hda` will read the entire input. `dmesg` is the process that exits first.