
grep library output from within Python

When calling a program from the command line, I can pipe the output to grep to select the lines I want to see, e.g.

printf "hello\ngood day\nfarewell\n" | grep day

I am looking for the same kind of line selection, but for a C library called from Python. Consider the following example:

import os

# Function which emulates a C library call
def call_library():
    os.system('printf "hello\ngood day\nfarewell\n"')

# Pure Python stuff
print('hello from Python')
# C library stuff
call_library()

When running this Python code, I want the output of the C part to be grep'ed for the string 'day', making the output of the code

hello from Python
good day

So far I have fiddled around with redirecting stdout, using the methods described here and here. I am able to make the C output vanish completely, or save it to a str and print it out later (which is what the two links are mainly concerned with). However, I am not able to select which lines get printed based on their content. Importantly, I want the output in real time while the C library is being called, so I cannot just redirect stdout to some buffer and process this buffer after the fact.

The solution only needs to work with Python 3.x on Linux. If, in addition to line selection, the solution also allows line editing, that would be even better.

I think the following should be possible, but I do not know how to set it up

  • Redirect stdout to a "file" in memory.

  • Spawn a new thread which constantly reads from this file, does the selection based on line content, and writes the wanted lines to the screen, i.e. the original destination of stdout.

  • Call the C library

  • Join the two threads back together and redirect stdout back to its original destination (the screen).

I do not have a firm enough grasp of file descriptors and the like to be able to do this, nor to even know if this is the best way of doing it.

Edit

Note that the solution cannot simply re-implement the code in call_library. The code must call call_library, totally agnostic to the actual code which then gets executed.

jmd_dk asked Nov 27 '17 18:11


1 Answer

I'm a little confused about exactly what your program is doing, but it sounds like you have a C library that writes to the C stdout (not the Python sys.stdout) and you want to capture this output and postprocess it, and you already have a Python binding for the C library, which you would prefer to use rather than a separate C program.

First off, you must use a child process to do this; nothing else will work reliably. This is because stdout is process-global, so there's no reliable way to capture only one thread's writes to stdout.
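
To make the process-global point concrete, here is a rough sketch of the in-process alternative outlined in the question: point file descriptor 1 at a pipe and let a thread filter what comes out. It is not the approach recommended in this answer, and the names filter_fd_stdout, keep and emit are made up for illustration. Because it swaps fd 1 for the whole process, it also filters Python's own print output and any other thread's writes, and anything the C library leaves in its stdio buffer only shows up once that buffer is flushed.

```python
import os
import sys
import threading
from contextlib import contextmanager

@contextmanager
def filter_fd_stdout(keep, emit=None):
    """Hypothetical helper: pass every line written to fd 1 through keep()."""
    sys.stdout.flush()                 # push out pending Python-level output first
    saved_fd = os.dup(1)               # remember where stdout really points
    read_fd, write_fd = os.pipe()
    os.dup2(write_fd, 1)               # from now on, fd 1 feeds the pipe
    os.close(write_fd)

    if emit is None:
        def emit(line):
            os.write(saved_fd, line)   # forward to the original stdout

    def pump():
        # Reads lines (as bytes) as they arrive, until every write end is closed.
        with os.fdopen(read_fd, "rb") as reader:
            for line in reader:
                if keep(line):
                    emit(line)

    worker = threading.Thread(target=pump)
    worker.start()
    try:
        yield
    finally:
        sys.stdout.flush()
        os.dup2(saved_fd, 1)           # restore fd 1; the pipe now sees EOF
        worker.join()
        os.close(saved_fd)

if __name__ == "__main__":
    print("hello from Python")
    with filter_fd_stdout(lambda line: b"day" in line):
        # Stand-in for the C library call, as in the question.
        os.system('printf "hello\\ngood day\\nfarewell\\n"')
```

Note that a print call placed inside the with block would be filtered as well, which is exactly the unreliability described above.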

Second, you can still use subprocess.Popen, because you can use it to re-invoke the current script. This is similar to what the Python multiprocessing module does when it starts workers with its "spawn" start method, and it's not terribly hard to do yourself. I would use a special, hidden command-line argument to distinguish the child, like this:

import argparse
import subprocess
import sys

def subprocess_call_c_lib():
    # Runs only in the child process: this is where the C library is called.
    import c_lib
    c_lib.do_stuff()

def invoke_c_lib():
    # Re-run this script as a child whose stdout is connected to a pipe.
    proc = subprocess.Popen([sys.executable, __file__,
                             "--internal-subprocess-call-c-lib"
                             # , ...
                             ],
                            stdin=subprocess.DEVNULL,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)  # yield str lines, not bytes
    for line in proc.stdout:
        # filter output from the library here
        # to display to "screen", write to sys.stdout as usual
        if "day" in line:
            sys.stdout.write(line)

    if proc.wait():
        raise subprocess.CalledProcessError(proc.returncode, "c_lib")

def main():
    ap = argparse.ArgumentParser(...)
    ap.add_argument("--internal-subprocess-call-c-lib", action="store_true",
                    help=argparse.SUPPRESS)
    # ... more arguments ...

    args = ap.parse_args()
    if args.internal_subprocess_call_c_lib:
        subprocess_call_c_lib()
        sys.exit(0)

    # otherwise, proceed as before ...

if __name__ == "__main__":
    main()
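
Since the question also asks about editing lines, the filtering loop could be factored into a small generator along these lines. This is only a sketch: stream_filtered, keep and edit are hypothetical names, and the child here is a throwaway Python one-liner standing in for the re-invoked script.

```python
import subprocess
import sys

def stream_filtered(cmd, keep, edit=lambda line: line):
    """Yield edit(line) for each stdout line of cmd that keep() accepts."""
    with subprocess.Popen(cmd,
                          stdin=subprocess.DEVNULL,
                          stdout=subprocess.PIPE,
                          universal_newlines=True) as proc:
        for line in proc.stdout:       # lines become available as the child writes them
            if keep(line):
                yield edit(line)
    # Leaving the with block has waited for the child, so returncode is set.
    if proc.returncode:
        raise subprocess.CalledProcessError(proc.returncode, cmd)

if __name__ == "__main__":
    # Stand-in child process; in practice this would be the re-invoked script.
    child = [sys.executable, "-c",
             "print('hello'); print('good day'); print('farewell')"]
    for line in stream_filtered(child, keep=lambda l: "day" in l, edit=str.upper):
        sys.stdout.write(line)
```

One caveat about the real-time requirement: when stdout is a pipe rather than a terminal, C stdio normally switches to block buffering, so the library's lines may arrive in bursts unless the child flushes its output (for example by calling fflush, or by changing the buffering mode with setvbuf before the first write).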

zwol answered Oct 11 '22 23:10