Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to capture inputs and outputs of a child process?

I'm trying to make a program which takes an executable name as an argument, runs the executable and reports the inputs and outputs for that run. For example consider a child program named "circle". The following would be desired run for my program:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input',  '10\n'), ('output', 'Area: 314.158997\n')]

I decided to use pexpect module for this job. It has a method called interact which lets the user interact with the child program as seen above. It also takes 2 optional parameters: output_filter and input_filter. From the documentation:

The output_filter will be passed all the output from the child process. The input_filter will be passed all the keyboard input from the user.

So this is the code I wrote:

capture_io.py

import sys
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


def capture_io(argv):
    _stdios.clear()
    child = pexpect.spawn(argv)
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios


if __name__ == '__main__':
    stdios_of_child = capture_io(sys.argv[1:])
    print(stdios_of_child)

circle.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    float radius, area;

    printf("Enter radius of circle: ");
    scanf("%f", &radius);

    if (radius < 0) {
        fprintf(stderr, "Negative radius values are not allowed.\n");
        exit(1);
    }

    area = 3.14159 * radius * radius;
    printf("Area: %f\n", area);
    return 0;
}

Which produces the following output:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '1'), ('output', '1'), ('input', '0'), ('output', '0'), ('input', '\r'), ('output', '\r\n'), ('output', 'Area: 314.158997\r\n')]

As you can observe from the output, input is processed character by character and also echoed back as output which creates such a mess. Is it possible to change this behaviour so that my input_filter will run only when Enter is pressed?

Or more generally, what would be the best way to achieve my goal (with or without pexpect)?

like image 925
Asocia Avatar asked Jun 09 '20 17:06

Asocia


1 Answers

When I started to write a helper, I realized that the main problem is that the input should be logged line buffered, so the backspace and other editing is done before the input reaches the program, but the output should be unbuffered in order to log the prompt which is not terminated by a new line.

To capture the output for the purpose of logging, a pipe is needed, but that automatically turns on line buffering. It is known that a pseudoterminal solves the problem (the expect module is built around a pseudoterminal), but a terminal has both the input and the output and we want to unbuffer only the output.

Fortunately there is the stdbuf utility. On Linux it alters the C library functions of dynamically linked executables. Not universally usable.

I have modified a Python bidirectional copy program to log the data it copies. Combined with the stdbuf it produces the desired output.

import select
import os

STDIN = 0
STDOUT = 1

BUFSIZE = 4096

def main(cmd):
    ipipe_r, ipipe_w = os.pipe()
    opipe_r, opipe_w = os.pipe()
    if os.fork():
        # parent
        os.close(ipipe_r)
        os.close(opipe_w)
        fdlist_r = [STDIN, opipe_r]
        while True:
            ready_r, _, _ = select.select(fdlist_r, [], []) 
            if STDIN in ready_r:
                # STDIN -> program
                data = os.read(STDIN, BUFSIZE)
                if data:
                    yield('in', data)   # optional: convert to str
                    os.write(ipipe_w, data)
                else:
                    # send EOF
                    fdlist_r.remove(STDIN)
                    os.close(ipipe_w)
            if opipe_r in ready_r:
                # program -> STDOUT
                data = os.read(opipe_r, BUFSIZE)
                if not data:
                    # got EOF
                    break
                yield('out', data)
                os.write(STDOUT, data)
        os.wait()
    else:
        # child
        os.close(ipipe_w)
        os.close(opipe_r)
        os.dup2(ipipe_r, STDIN)
        os.dup2(opipe_w, STDOUT)
        os.execlp(*cmd)
        # not reached
        os._exit(127)

if __name__ == '__main__':
    log = list(main(['stdbuf', 'stdbuf', '-o0', './circle']))
    print(log)

It prints:

[('out', b'Enter radius of circle: '), ('in', b'12\n'), ('out', b'Area: 452.388947\n')]
like image 61
VPfB Avatar answered Oct 19 '22 21:10

VPfB