
Run command and get its stdout, stderr separately in near real time like in a terminal

I am trying to find a way in Python to run other programs in such a way that:

  1. The stdout and stderr of the program being run can be logged separately.
  2. The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)
  3. Bonus criteria: The program being run does not know it is being run via Python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output). This small criterion pretty much means we will need to use a pty, I think.

Here is what I've got so far... Method 1:

```python
def method1(command):
    ## subprocess.communicate() will give us the stdout and stderr separately,
    ## but we will have to wait until the end of command execution to print anything.
    ## This means if the child process hangs, we will never know....
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            shell=True, executable='/bin/bash')
    stdout, stderr = proc.communicate()  # record both, but no way to print stdout/stderr in real-time
    print ' ######### REAL-TIME ######### '
    ########         Not Possible
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print stdout
    print 'STDERR:'
    print stderr
```

Method 2

```python
def method2(command):
    ## Using pexpect to run our command in a pty, we can see the child's stdout in real-time,
    ## however we cannot see the stderr from "curl google.com", presumably because it is not
    ## connected to a pty? Furthermore, I do not know how to log it beyond writing out to a
    ## file (p.logfile). I need the stdout and stderr as strings, not files on disk! On the
    ## upside, pexpect would give a lot of extra functionality (if it worked!)
    proc = pexpect.spawn('/bin/bash', ['-c', command])
    print ' ######### REAL-TIME ######### '
    proc.interact()
    print ' ########## RESULTS ########## '
    ########         Not Possible
```

Method 3:

```python
def method3(command):
    ## This method is very much like method1, and would work exactly as desired
    ## if only proc.xxx.read(1) wouldn't block waiting for something. Which it does. So this is useless.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            shell=True, executable='/bin/bash')
    print ' ######### REAL-TIME ######### '
    out, err, outbuf, errbuf = '', '', '', ''
    firstToSpeak = None
    while proc.poll() is None:
        stdout = proc.stdout.read(1)  # blocks
        stderr = proc.stderr.read(1)  # also blocks
        if firstToSpeak is None:
            if stdout != '':
                firstToSpeak = 'stdout'; outbuf, errbuf = stdout, stderr
            elif stderr != '':
                firstToSpeak = 'stderr'; outbuf, errbuf = stdout, stderr
        else:
            if (stdout != '') or (stderr != ''):
                outbuf += stdout; errbuf += stderr
            else:
                out += outbuf; err += errbuf
                if firstToSpeak == 'stdout':
                    sys.stdout.write(outbuf + errbuf); sys.stdout.flush()
                else:
                    sys.stdout.write(errbuf + outbuf); sys.stdout.flush()
                firstToSpeak = None
    print ''
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print out
    print 'STDERR:'
    print err
```

To try these methods out, you will need to import sys, subprocess, and pexpect.

pexpect is pure Python and can be had with

sudo pip install pexpect

I think the solution will involve Python's pty module, which is somewhat of a black art that I cannot find anyone who knows how to use. Perhaps SO knows :) As a heads-up, I recommend you use 'curl www.google.com' as a test command, because it prints its status out on stderr for some reason :D


UPDATE-1:
OK so the pty library is not fit for human consumption. The docs, essentially, are the source code. Any presented solution that is blocking and not async is not going to work here. The Threads/Queue method by Padraic Cunningham works great, although adding pty support is not possible - and it's 'dirty' (to quote Freenode's #python). It seems like the only solution fit for production-standard code is using the Twisted framework, which even supports pty as a boolean switch to run processes exactly as if they were invoked from the shell. But adding Twisted into a project requires a total rewrite of all the code. This is a total bummer :/

UPDATE-2:

Two answers were provided, one of which addresses the first two criteria and will work well where you just need both the stdout and stderr using Threads and Queue. The other answer uses select, a non-blocking method for reading file descriptors, and pty, a method to "trick" the spawned process into believing it is running in a real terminal just as if it was run from Bash directly - but may or may not have side-effects. I wish I could accept both answers, because the "correct" method really depends on the situation and why you are subprocessing in the first place, but alas, I could only accept one.

Asked Aug 10 '15 by J.J

2 Answers

The stdout and stderr of the program being run can be logged separately.

You can't use pexpect because both stdout and stderr go to the same pty and there is no way to separate them after that.

The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)

If the output of a subprocess is not a tty then it is likely that it uses block buffering, and therefore if it doesn't produce much output then it won't be "real time", e.g., if the buffer is 4K then your parent Python process won't see anything until the child process prints 4K chars and the buffer overflows, or it is flushed explicitly (inside the subprocess). This buffer is inside the child process and there are no standard ways to manage it from outside. Here's a picture that shows the stdio buffers and the pipe buffer for a command1 | command2 shell pipeline:

[diagram: pipe/stdio buffers]

The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output).

It seems you meant the opposite, i.e., it is likely that your child process chunks its output instead of flushing each output line as soon as possible when the output is redirected to a pipe (when you use stdout=PIPE in Python). This means the default threading or asyncio solutions won't work as-is in your case.

There are several options to work around it:

  • the command may accept a command-line argument that disables block buffering, such as grep --line-buffered or python -u.

  • stdbuf works for some programs, i.e., you could run ['stdbuf', '-oL', '-eL'] + command using the threading or asyncio solution above, and you should get stdout and stderr separately, with lines appearing in near-real time:

```python
#!/usr/bin/env python3
import os
import sys
from select import select
from subprocess import Popen, PIPE

with Popen(['stdbuf', '-oL', '-e0', 'curl', 'www.google.com'],
           stdout=PIPE, stderr=PIPE) as p:
    readable = {
        p.stdout.fileno(): sys.stdout.buffer,  # log separately
        p.stderr.fileno(): sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            data = os.read(fd, 1024)  # read available
            if not data:  # EOF
                del readable[fd]
            else:
                readable[fd].write(data)
                readable[fd].flush()
```
  • finally, you could try the pty + select solution with two ptys:

```python
#!/usr/bin/env python3
import errno
import os
import pty
import sys
from select import select
from subprocess import Popen

masters, slaves = zip(pty.openpty(), pty.openpty())
with Popen([sys.executable, '-c', r'''import sys, time
print('stdout', 1) # no explicit flush
time.sleep(.5)
print('stderr', 2, file=sys.stderr)
time.sleep(.5)
print('stdout', 3)
time.sleep(.5)
print('stderr', 4, file=sys.stderr)
'''],
           stdin=slaves[0], stdout=slaves[0], stderr=slaves[1]):
    for fd in slaves:
        os.close(fd)  # no input
    readable = {
        masters[0]: sys.stdout.buffer,  # log separately
        masters[1]: sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            try:
                data = os.read(fd, 1024)  # read available
            except OSError as e:
                if e.errno != errno.EIO:
                    raise  # XXX cleanup
                del readable[fd]  # EIO means EOF on some systems
            else:
                if not data:  # EOF
                    del readable[fd]
                else:
                    readable[fd].write(data)
                    readable[fd].flush()
for fd in masters:
    os.close(fd)
```

    I don't know what the side-effects of using different ptys for stdout and stderr are. You could try whether a single pty is enough in your case, e.g., set stderr=PIPE and use p.stderr.fileno() instead of masters[1]. A comment in the sh source suggests that there are issues if stderr is not in {STDOUT, pipe}.

Answered Sep 20 '22 by jfs


If you want to read from stderr and stdout and get the output separately, you can use a Thread with a Queue. Not overly tested, but something like the following:

```python
import threading
import queue
from subprocess import Popen, PIPE

def run(fd, q):
    for line in iter(fd.readline, ''):
        q.put(line)
    q.put(None)

def create(fd):
    q = queue.Queue()
    t = threading.Thread(target=run, args=(fd, q))
    t.daemon = True
    t.start()
    return q, t

process = Popen(["curl", "www.google.com"], stdout=PIPE, stderr=PIPE,
                universal_newlines=True)

std_q, std_out = create(process.stdout)
err_q, err_read = create(process.stderr)

while std_out.is_alive() or err_read.is_alive():
    for line in iter(std_q.get, None):
        print(line)
    for line in iter(err_q.get, None):
        print(line)
```
Answered Sep 18 '22 by Padraic Cunningham