xargs output buffering -P parallel

Tags: bash, shell, sed, awk

I have a bash function that I call in parallel using xargs -P, like so:

 echo ${list} | xargs -n 1 -P 24 -I@ bash -l -c 'myAwesomeShellFunction @'

Everything works fine, but the output is messed up for obvious reasons (no buffering).

I'm trying to figure out a way to buffer the output effectively. I was thinking I could use awk, but I'm not good enough to write such a script, and I can't find anything worthwhile on Google. Can someone help me write this "output buffer" in sed or awk? Nothing fancy: just accumulate the output and spit it out after the process terminates. I don't care about the order in which the shell functions execute; I just need their output buffered. Something like:

 echo ${list} | xargs -n 1 -P 24 -I@ bash -l -c 'myAwesomeShellFunction @ | sed -u ""'

P.S. I tried to use stdbuf as per https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe but it did not work. I specified line buffering for stdout and stderr (-oL and -eL), but the output is still unbuffered:

 echo ${list} | xargs -n 1 -P 24 -I@ stdbuf -i0 -oL -eL bash -l -c 'myAwesomeShellFunction @'

Here's my first attempt; it only captures the first line of output:

 $ bash -c "echo stuff;sleep 3; echo more stuff" | awk '{while (( getline line) > 0 )print "got ",$line;}'
 $ got  stuff
niken asked Jun 15 '17 14:06


1 Answer

This isn't quite atomic if the output of a single call is longer than a page (typically 4 KB), but for most cases it'll do:

xargs -P 24 bash -c 'for arg; do printf "%s\n" "$(myAwesomeShellFunction "$arg")"; done' _

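The magic here is the command substitution: $(...) creates a subshell (a fork()ed-off copy of your shell), runs the code ... in it, and then reads that code's output so it can be substituted into the relevant position in the outer script. Nothing is printed until the function has finished, at which point printf emits the collected output at once.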
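
To see the buffering effect of command substitution in isolation, here is a minimal sketch. The function slow_job is a hypothetical stand-in for myAwesomeShellFunction, not part of the original question:

```shell
#!/usr/bin/env bash
# slow_job is a hypothetical stand-in for myAwesomeShellFunction:
# it writes two lines with a pause in between.
slow_job() {
  echo "start $1"
  sleep 1
  echo "end $1"
}

# $(...) collects everything slow_job writes to stdout and hands it back
# only after the subshell has exited, so both lines reach printf together.
buffered=$(slow_job A)
printf '%s\n' "$buffered"
```

Only stdout is captured this way; anything the function writes to stderr still appears immediately unless it is redirected (for example with 2>&1 inside the substitution). Command substitution also strips trailing newlines, which is why printf adds one back.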

Note that we don't need -n 1 when dealing with a large number of arguments (for a small number it may improve parallelization), since the for loop iterates over however many arguments each of the 24 parallel bash instances is passed.
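
As a rough illustration of how -n changes the way arguments are distributed, the following sketch uses three made-up arguments (a, b, c) and has each bash instance report how many it received:

```shell
#!/usr/bin/env bash
# Without -n, xargs packs as many arguments as fit onto one command line,
# so three short arguments all go to a single bash instance.
batched=$(printf '%s\n' a b c | xargs -P 4 bash -c 'echo "$# args"' _)

# With -n 1, every argument gets its own bash instance, which lets
# a small list spread across the parallel slots.
split=$(printf '%s\n' a b c | xargs -n 1 -P 4 bash -c 'echo "$# args"' _)

printf 'batched:\n%s\nsplit:\n%s\n' "$batched" "$split"
```

With a short list, the first form therefore runs everything serially in one process, which is the case where adding -n 1 (or some other small -n value) helps.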


If you want to make it truly atomic, you can do that with a lockfile:

# generate a lockfile, arrange for it to be deleted when this shell exits
lockfile=$(mktemp -t lock.XXXXXX); export lockfile
trap 'rm -f "$lockfile"' 0

xargs -P 24 bash -c '
  for arg; do
    {
      output=$(myAwesomeShellFunction "$arg")
      flock -x 99
      printf "%s\n" "$output"
    } 99>"$lockfile"
  done
' _
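
For experimenting with the lockfile approach, here is a self-contained sketch under a few assumptions: job is a hypothetical stand-in for myAwesomeShellFunction and is defined inside the bash -c script so that each instance can see it, the input list 1 to 4 is made up, and flock is assumed to be the util-linux command:

```shell
#!/usr/bin/env bash
# Create the lockfile and remove it when this shell exits.
lockfile=$(mktemp); export lockfile
trap 'rm -f "$lockfile"' EXIT

result=$(printf '%s\n' 1 2 3 4 | xargs -n 1 -P 4 bash -c '
  # job is a hypothetical stand-in for myAwesomeShellFunction.
  job() { echo "begin $1"; sleep 1; echo "end $1"; }
  for arg; do
    {
      output=$(job "$arg")     # run the job and collect its stdout
      flock -x 99              # wait for the exclusive lock on fd 99
      printf "%s\n" "$output"  # print while holding the lock
    } 99>"$lockfile"           # fd 99 closes here, releasing the lock
  done
' _)
printf '%s\n' "$result"
```

Each "begin N" line should be followed directly by its matching "end N" line, although the order of the pairs can vary from run to run. A function defined in the calling shell rather than inside the script would have to be exported with export -f (or come from a profile loaded via bash -l, as in the question) to be visible to the bash instances that xargs starts.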
Charles Duffy answered Sep 17 '22 18:09