
processing continuous output of a command in python

I'm brand new to python, having used perl for years. A typical thing I do all the time in perl is open a command as a pipe and assign its output to a local variable for processing. In other words:

"open CMD, "$command|";
$output=<CMD>;

a piece of cake. I think I can do something similar in python this way:

args=[command, args...]
process=subprocess.Popen(args, stdout=subprocess.PIPE)
output=process.communicate()
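
From what I can tell, communicate() waits for the command to exit and hands back a (stdout, stderr) tuple rather than a single string, so for a one-shot command I end up with something like this (echo is just a stand-in for a real command):

import subprocess

# 'echo' here is only a placeholder for a real command
process = subprocess.Popen(['echo', 'hello'], stdout=subprocess.PIPE)

# communicate() blocks until the process exits and returns (stdout, stderr);
# stderr is None because no stderr pipe was requested
output, err = process.communicate()
print output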

so far so good. Now for the big question...

If I fire off that command using an ssh on multiple platforms, I can then monitor the descriptors in perl inside a select loop to process the results as they come in. I did find the python select and poll modules but am not quite sure how to use them. The documentation says poll would take a file handle, but when I try to pass the variable 'process' above to poll.register() I get an error that it must be an int or have a fileno() method. Since Popen() used stdout, I tried calling

poll.register(process.stdout)

and it no longer throws an error, but instead just hangs.

Any suggestions/pointers of how to make something like this work?

asked Jan 12 '12 by Mark J Seger


2 Answers

Using select.select: You need to pass objects with a fileno() method or real file descriptors (integers):

import os, sys, select, subprocess

# two long-running commands that print a timestamp every two seconds
args = ['sh', '-c', 'while true; do date; sleep 2; done']
p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
p2 = subprocess.Popen(args, stdout=subprocess.PIPE)

while True:
    # block until at least one of the two pipes has data ready
    rlist, wlist, xlist = select.select([p1.stdout, p2.stdout], [], [])
    for stdout in rlist:
        sys.stdout.write(os.read(stdout.fileno(), 1024))

You'll see it pause every two seconds and then produce more output as it becomes available. The "trick" is that p1.stdout is a normal file-like object with a fileno method that returns the file descriptor number. That is all select needs.

Note that I'm reading from stdout using os.read instead of simply calling stdout.read. This is because a call like stdout.read(1024) makes your program wait until the requested number of bytes have been read. Fewer bytes are only returned at EOF, and since EOF is never reached here, the stdout.read call would block until the full 1024 bytes are available.

This is unlike the os.read function, which has no qualms about returning early when fewer bytes are available: it returns straight away with whatever is there. In other words, getting less than 1024 bytes back from os.read(stdout.fileno(), 1024) is not a sign that stdout has been closed.
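
Eventually you do want to detect end-of-file, though: with os.read, a closed pipe shows up as an empty string. Here is a minimal sketch of draining a single pipe until EOF (the command is only an illustration; any finite command works):

import os, sys, subprocess

# a short-lived command so the pipe actually reaches EOF
p = subprocess.Popen(['sh', '-c', 'date; sleep 1; date'], stdout=subprocess.PIPE)

while True:
    chunk = os.read(p.stdout.fileno(), 1024)
    if not chunk:        # empty string means the other end closed the pipe
        break
    sys.stdout.write(chunk)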

Using select.poll is almost identical, except that you get a "raw" file descriptor (FD) back, which you then read with os.read:

import os, sys, select, subprocess

args = ['sh', '-c', 'while true; do date; sleep 2; done']
p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
p2 = subprocess.Popen(args, stdout=subprocess.PIPE)

poll = select.poll()
poll.register(p1.stdout)
poll.register(p2.stdout)

while True:
    # poll() returns (fd, event) pairs for every registered descriptor that is ready
    rlist = poll.poll()
    for fd, event in rlist:
        sys.stdout.write(os.read(fd, 1024))

A closed FD is signaled by the select.POLLHUP event being returned. You can then call the unregister method and finally break out of the loop when all FDs are closed.
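
Here is a rough sketch of one way that termination handling could look; the finite command is only there so the pipes actually close:

import os, sys, select, subprocess

args = ['sh', '-c', 'date; sleep 2; date']   # finite, so stdout eventually closes
p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
p2 = subprocess.Popen(args, stdout=subprocess.PIPE)

poll = select.poll()
poll.register(p1.stdout)
poll.register(p2.stdout)
open_fds = set([p1.stdout.fileno(), p2.stdout.fileno()])

while open_fds:
    for fd, event in poll.poll():
        data = os.read(fd, 1024)
        if data:
            sys.stdout.write(data)
        else:
            # an empty read means EOF; poll reports this with the POLLHUP event
            poll.unregister(fd)
            open_fds.discard(fd)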

Finally, let me note that you could of course make a dictionary with a mapping from file descriptors back to the file-like objects, and hence back to the processes you launched.
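
For instance, something along these lines (just a sketch of the bookkeeping, reusing p1, p2 and the poll object from above):

# map each file descriptor back to the Popen object that owns it
fd_to_proc = {
    p1.stdout.fileno(): p1,
    p2.stdout.fileno(): p2,
}

for fd, event in poll.poll():
    proc = fd_to_proc[fd]                     # which process produced this output?
    sys.stdout.write('[pid %d] ' % proc.pid)
    sys.stdout.write(os.read(fd, 1024))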

answered Oct 20 '22 by Martin Geisler


import subprocess

# run the command with its stdout connected to a pipe
p = subprocess.Popen('apt-get autoclean', stdout=subprocess.PIPE,
                     stderr=None, shell=True)

# readline() returns '' only at EOF, so this prints each line as soon as it arrives
for line in iter(p.stdout.readline, ''):
    print line

p.stdout.close()

print("Done")
answered Oct 20 '22 by vishal