
Bash process substitution and syncing

(Possibly related to "Do some programs not accept process substitution for input files?")

In some Bash unit test scripts I'm using the following trick to log and display stdout and stderr of a command:

command > >(tee "${stdoutF}") 2> >(tee "${stderrF}" >&2)

This process produces some output to stdout, so the $stdoutF file gets some data. Then I run another command which does not output any data:

diff -r "$source" "$target" > >(tee "${stdoutF}") 2> >(tee "${stderrF}" >&2)

However, it doesn't look like this process always finishes successfully before the test for emptiness is run (using shunit-ng):

assertNull 'Unexpected output to stdout' "$(<"$stdoutF")"

In a 100-run test this failed 25 times.

Should it be sufficient to call sync before testing the file for emptiness:

sync
assertNull 'Unexpected output to stdout' "$(<"$stdoutF")"

... and/or should it work by forcing the sequence of the commands:

diff -r "$source" "$target" \
> >(tee "${stdoutF}"; assertNull 'Unexpected output to stdout' "$(<"$stdoutF")") \
2> >(tee "${stderrF}" >&2)

... and/or is it possible to tee it somehow to assertNull directly instead of a file?

Update: sync is not the answer - See Gilles' response below.

Update 2: Discussion taken further to "Save stdout, stderr and stdout+stderr synchronously". Thanks for the answers!

l0b0 asked Dec 20 '10 11:12


1 Answer

In bash, a command that uses an output process substitution, such as foo > >(bar), finishes as soon as foo finishes; bash does not wait for bar. (This is not discussed in the documentation.) You can check this with

: > >(sleep 1; echo a)

This command returns immediately, then prints a asynchronously one second later.
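The same behaviour can be observed without watching the terminal. The following is a minimal sketch, assuming bash and a writable temporary directory; the variable names marker, immediately and later are made up for this illustration. The substituted process writes to a file only after its one-second sleep, and the script reads that file once right after the command returns and once more after waiting.

```bash
#!/usr/bin/env bash
# Hypothetical scratch file used only for this demonstration.
marker=$(mktemp)

# The substituted process writes to the marker file only after one second.
: > >(sleep 1; echo a > "$marker")

immediately=$(<"$marker")   # read right after ':' has returned
sleep 2                     # give the background process time to finish
later=$(<"$marker")         # read again once the substituted process is done

printf 'immediately:   "%s"\n' "$immediately"
printf 'after waiting: "%s"\n' "$later"
```

If bash does not wait for the substituted process, the first value should be empty and the second should be a. This is the same window in which the assertion on $stdoutF can observe stale or incomplete content.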

In your case, the tee command takes a short but nonzero time to finish after command completes. Adding sync gave tee enough time to complete, but this doesn't remove the race condition any more than adding a sleep would; it just makes the race less likely to manifest.

More generally, sync does not have any internally observable effect: it only makes a difference if you want to access the device where your filesystems are stored from a different operating system instance. In clearer terms, if your system loses power, only data written before the last sync is guaranteed to be available after you reboot.

As for removing the race condition, here are a few possible approaches:

  • Explicitly synchronize all substituted processes.

    mkfifo sync.pipe
    command > >(tee -- "$stdoutF"; echo >sync.pipe) \
           2> >(tee -- "$stderrF" >&2; echo >sync.pipe)
    read line < sync.pipe; read line < sync.pipe
    
  • Use a different temporary file name for each command instead of reusing $stdoutF and $stderrF, and enforce that the temporary file is always newly created.

  • Give up on process substitution and use pipes instead.

    { { command | tee -- "$stdoutF" 1>&3; } 2>&1 \
                | tee -- "$stderrF" 1>&2; } 3>&1
    

    If you need the command's return status, bash puts it in ${PIPESTATUS[0]}.

    { { command | tee -- "$stdoutF" 1>&3; exit ${PIPESTATUS[0]}; } 2>&1 \
                | tee -- "$stderrF" 1>&2; } 3>&1
    if [ ${PIPESTATUS[0]} -ne 0 ]; then echo command failed; fi
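
To try the pipe-based approach in isolation, here is a minimal self-contained sketch, assuming bash. The function demo_cmd is a hypothetical stand-in for the command under test, and the temporary files are created with mktemp only for the demonstration. Because a pipeline does not complete until all of its parts have exited, both tee processes should be done by the time the files are read.

```bash
#!/usr/bin/env bash
# demo_cmd is a hypothetical stand-in for the real command under test:
# it writes one line to each stream and returns a non-zero status.
demo_cmd() {
    echo "to stdout"
    echo "to stderr" >&2
    return 3
}

stdoutF=$(mktemp)
stderrF=$(mktemp)
trap 'rm -f -- "$stdoutF" "$stderrF"' EXIT   # remove the scratch files on exit

# fd 3 carries the original stdout past the second tee.
{ { demo_cmd | tee -- "$stdoutF" 1>&3; exit "${PIPESTATUS[0]}"; } 2>&1 \
             | tee -- "$stderrF" 1>&2; } 3>&1
status=${PIPESTATUS[0]}   # save it at once: the next command overwrites PIPESTATUS

printf 'status=%s\n' "$status"
printf 'captured stdout: %s\n' "$(<"$stdoutF")"
printf 'captured stderr: %s\n' "$(<"$stderrF")"
```

Saving ${PIPESTATUS[0]} into a variable straight away matters because any later command, including a test with [ ... ], replaces the array.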
    
Gilles 'SO- stop being evil' answered Oct 12 '22 17:10