Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

on-the-fly output redirection, seeing the file redirection output while the program is still running

Tags:

linux

bash

If I use a command like this one:
./program >> a.txt &
, and the program is a long running one then I can only see the output once the program ended. That means I have no way of knowing if the computation is going well until it actually stops computing. I want to be able to read the redirected output on file while the program is running.

This is similar to opening a file, appending to it, then closing it back after every writing. If the file is only closed at the end of the program then no data can be read on it until the program ends. The only redirection I know is similar to closing the file at the end of the program.

You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.

l = range(0,100000)
for i in l:
  if i%1000==0:
    print i
  for j in l:
    s = i + j

One can run this with:
./python program.py >> a.txt &
Then cat a.txt .. you will only get results once the script is done computing.

like image 541
grokkaine Avatar asked Mar 31 '11 13:03

grokkaine


1 Answers

From the stdout manual page:

The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a new‐line is printed.

Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.

Ways to work around this:

  • Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush() of even some of the suggestions here.

  • Use a utility that can record from a PTY, rather than outright stdout redirections. GNU Screen can do this for you:

    screen -d -m -L python test.py
    

    would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.

EDIT:

On most Linux system there is a third workaround: You can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:

  • It won't work at all on static executables

  • It's fragile and rather ugly.

  • It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.

  • It's fragile and rather ugly.

  • It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.

  • It's fragile and rather ugly.

  • It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.

  • Did I mention that it's fragile and rather ugly?

That said, the process it relatively simple. You put in a C file, e.g. linebufferedstdout.c the replacement functions:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>


char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;

    if (getenv_real == NULL) {
        getenv_real = dlsym(RTLD_NEXT, "getenv");

        setlinebuf(stdout);
    }

    return getenv_real(s);
}

Then you compile that file as a shared object:

gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc

Then you set the LD_PRELOAD variable to load it along with your program:

$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out 
0
1000
2000
3000
4000

If you are lucky, your problem will be solved with no unfortunate side-effects.

You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.

like image 107
thkala Avatar answered Oct 22 '22 13:10

thkala