I know about <pre class="prettyprint"><code>program1 | program2 </code></pre> and <pre class="prettyprint"><code>program1 | tee outputfile | program2 </code></pre> but is there a way to feed program1's output into both program2 and program3?

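You can do this with <code>tee</code> and process substitution. <pre class="prettyprint"><code>program1 | tee >(program2) >(program3) </code></pre> The output of <code>program1</code> will be piped to whatever is inside each <code>>( )</code>, in this case <code>program2</code> and <code>program3</code>. Note that <code>tee</code> also copies the data to its own standard output, so append <code>>/dev/null</code> if you do not want it on the terminal.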
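As a small runnable sketch of the command above: the two <code>awk</code> consumers below are stand-ins I chose for <code>program2</code> and <code>program3</code>, and <code>seq</code> plays the role of <code>program1</code>. Because process substitution is not POSIX, the pipeline is handed to <code>bash</code> explicitly through a here-document, which is only a convenience so the snippet can be pasted into any sh-compatible shell.

```shell
# Sketch only: the awk commands are placeholders for program2 and program3.
# The here-document is run by bash because >(...) is not available in plain sh.
result=$(bash <<'EOF'
seq 1 1000 |
  tee >(awk 'END { print "lines:", NR }') \
      >(awk '{ s += $1 } END { print "sum:", s }') \
      >/dev/null
EOF
)

# Both consumers run concurrently, so the two lines may arrive in either order.
printf '%s\n' "$result"
```

The command substitution only returns once every process holding its pipe has exited, which is why both results are reliably captured even though the consumers run asynchronously.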

<h3>Intro about parallelisation </h3>

This seems trivial, but it is not only possible: doing so also creates concurrent (simultaneous) processes. You may have to take care of some particular effects, like order of execution, execution time, etc. There are some examples at the end of this post.

<h3>Compatible answer first</h3>

As this question is tagged shell and unix, I will first give a POSIX compatible answer (for bashisms, read further).

Yes, there is a way to use unnamed pipes.

In this example, I will generate a range of 100'000 numbers, shuffle them and compress the result using 4 different compression tools to compare the compression ratios.

For this, I will first run the preparation:

<pre class="prettyprint"><code>GZIP_CMD=`which gzip`
BZIP2_CMD=`which bzip2`
LZMA_CMD=`which lzma`
XZ_CMD=`which xz`
MD5SUM_CMD=`which md5sum`
SED_CMD=`which sed`
</code></pre>

Note: specifying the full path to each command prevents some shell interpreters (like busybox) from running a built-in compressor. Doing it this way also ensures that the same syntax runs independently of the OS installation (paths can differ between macOS, Ubuntu, RedHat, HP-UX and so on).

The syntax <code>NN>&1</code> (where NN is a number between 3 and 63; strictly POSIX shells such as dash only accept single digits, so 3 to 9 is the portable range) duplicates the current standard output onto file descriptor NN. When that standard output is the unnamed pipe feeding the next command, the pipe can then also be reached at <code>/dev/fd/NN</code>. (The file descriptors 0 to 2 are already open for 0: STDIN, 1: STDOUT and 2: STDERR.)
Try this (tested under dash, busybox and bash):

<pre class="prettyprint"><code>(((( seq 1 100000 | shuf | tee /dev/fd/4 /dev/fd/5 /dev/fd/6 /dev/fd/7 | $GZIP_CMD >/tmp/tst.gz ) 4>&1 | $BZIP2_CMD >/tmp/tst.bz2 ) 5>&1 | $LZMA_CMD >/tmp/tst.lzma ) 6>&1 | $XZ_CMD >/tmp/tst.xz ) 7>&1 | $MD5SUM_CMD
</code></pre>

or more readable:

<pre class="prettyprint"><code>GZIP_CMD=`which gzip`
BZIP2_CMD=`which bzip2`
LZMA_CMD=`which lzma`
XZ_CMD=`which xz`
MD5SUM_CMD=`which md5sum`

(
  (
    (
      (
        seq 1 100000 |
          shuf |
          tee /dev/fd/4 /dev/fd/5 /dev/fd/6 /dev/fd/7 |
          $GZIP_CMD >/tmp/tst.gz
      ) 4>&1 |
        $BZIP2_CMD >/tmp/tst.bz2
    ) 5>&1 |
      $LZMA_CMD >/tmp/tst.lzma
  ) 6>&1 |
    $XZ_CMD >/tmp/tst.xz
) 7>&1 |
  $MD5SUM_CMD
2e67f6ad33745dc5134767f0954cbdd6  -
</code></pre>

As <code>shuf</code> shuffles randomly, you will get different results if you try this:

<pre class="prettyprint"><code>ls -ltrS /tmp/tst.*
-rw-r--r-- 1 user user 230516 oct  1 22:14 /tmp/tst.bz2
-rw-r--r-- 1 user user 254811 oct  1 22:14 /tmp/tst.lzma
-rw-r--r-- 1 user user 254892 oct  1 22:14 /tmp/tst.xz
-rw-r--r-- 1 user user 275003 oct  1 22:14 /tmp/tst.gz
</code></pre>

but the md5 checksums of the decompressed files must all match the one printed above:

<pre class="prettyprint"><code>SED_CMD=`which sed`

for chk in gz:$GZIP_CMD bz2:$BZIP2_CMD lzma:$LZMA_CMD xz:$XZ_CMD;do
    ${chk#*:} -d < /tmp/tst.${chk%:*} |
        $MD5SUM_CMD |
        $SED_CMD s/-$/tst.${chk%:*}/
  done
2e67f6ad33745dc5134767f0954cbdd6  tst.gz
2e67f6ad33745dc5134767f0954cbdd6  tst.bz2
2e67f6ad33745dc5134767f0954cbdd6  tst.lzma
2e67f6ad33745dc5134767f0954cbdd6  tst.xz
</code></pre>

<h3>Using bash features</h3>

With some bashisms this can look nicer, for example by using brace expansion, <code>tee /dev/fd/{4,5,6,7}</code>, instead of <code>tee /dev/fd/4 /dev/fd/5 /...</code>

<pre class="prettyprint"><code>(((( seq 1 100000 | shuf | tee /dev/fd/{4,5,6,7} | gzip >/tmp/tst.gz ) 4>&1 |
   bzip2 >/tmp/tst.bz2 ) 5>&1 | lzma >/tmp/tst.lzma ) 6>&1 |
   xz >/tmp/tst.xz ) 7>&1 | md5sum
29078875555e113b31bd1ae876937d4b  -
</code></pre>

This works the same way.
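The descriptor plumbing can be easier to follow with only two consumers. The following is a minimal sketch of the same nested-subshell technique, not the code from the answer itself: the two <code>awk</code> commands are placeholders I picked so the result is predictable, and descriptors 3 and 4 are arbitrary choices. It assumes a system that provides <code>/dev/fd</code> and the <code>seq</code> command.

```shell
# Sketch only: the awk commands are illustrative consumers; fd 3 and 4 are arbitrary.
result=$(
  (
    (
      # tee copies the stream to fd 4 (the pipe feeding the second awk)
      # and to its stdout (the pipe feeding the first awk).
      seq 1 1000 |
        tee /dev/fd/4 |
        awk 'END { print "lines:", NR }' >&3
    ) 4>&1 |
      awk '{ s += $1 } END { print "sum:", s }'
  ) 3>&1 |
    sort
)

printf '%s\n' "$result"
```

Inside the inner subshell, standard output is the pipe to the second <code>awk</code>, so <code>4>&1</code> makes that pipe writable as <code>/dev/fd/4</code>. The first <code>awk</code> cannot print to its own standard output without feeding the second one, which is why it writes to descriptor 3, a copy of the outer standard output. The final <code>sort</code> is only there to make the order of the two report lines stable.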
<h3>Final check</h3>

This won't create any file, but lets you compare the size of a compressed range of sorted integers between 4 different compression tools (for fun, I used 4 different ways of formatting the output):

<pre class="prettyprint"><code>(
  (
    (
      (
        (
          seq 1 100000 |
            tee /dev/fd/{4,5,6,7} |
              gzip |
              wc -c |
              sed s/^/gzip:\ \ / >&3
        ) 4>&1 |
          bzip2 |
          wc -c |
          xargs printf "bzip2: %s\n" >&3
      ) 5>&1 |
        lzma |
        wc -c |
        perl -pe 's/^/lzma:   /' >&3
    ) 6>&1 |
      xz |
      wc -c |
      awk '{printf "xz: %9s\n",$1}' >&3
  ) 7>&1 |
    wc -c
) 3>&1
gzip:  215157
bzip2: 124009
lzma:   17948
xz:     17992
588895
</code></pre>

This demonstrates how stdin and stdout can be redirected inside subshells and merged again on the console for the final output.

<h3>Syntax <code>>(...)</code> and <code><(...)</code> </h3>

Bash provides a dedicated syntax for this, called process substitution.

<pre class="prettyprint"><code>seq 1 100000 | wc -l
100000

seq 1 100000 > >( wc -l )
100000

wc -l < <( seq 1 100000 )
100000
</code></pre>

Just as <code>|</code> connects an unnamed pipe to the next command's <code>/dev/fd/0</code>, the syntax <code><()</code> creates a temporary unnamed pipe on another file descriptor and puts its path, <code>/dev/fd/XX</code>, on the command line.

<pre class="prettyprint"><code>md5sum <(zcat /tmp/tst.gz) <(bzcat /tmp/tst.bz2) <(
         lzcat /tmp/tst.lzma) <(xzcat /tmp/tst.xz)
29078875555e113b31bd1ae876937d4b  /dev/fd/63
29078875555e113b31bd1ae876937d4b  /dev/fd/62
29078875555e113b31bd1ae876937d4b  /dev/fd/61
29078875555e113b31bd1ae876937d4b  /dev/fd/60
</code></pre>

<h3>More sophisticated demo</h3>

This requires the <code>file</code> utility (with <code>--mime-type</code> support) to be installed. It determines the command to run from the file extension or, failing that, from the file type.
<pre class="prettyprint"><code>for file in /tmp/tst.*;do
    cmd=$(which ${file##*.}) || {
        cmd=$(file -b --mime-type $file)
        # keep what follows the last "/" or "-", so that both
        # application/gzip and application/x-gzip give "gzip"
        cmd=$(which ${cmd##*[/-]})
    }
    read -a md5 < <($cmd -d <$file|md5sum)
    echo $md5 \ $file
  done
29078875555e113b31bd1ae876937d4b  /tmp/tst.bz2
29078875555e113b31bd1ae876937d4b  /tmp/tst.gz
29078875555e113b31bd1ae876937d4b  /tmp/tst.lzma
29078875555e113b31bd1ae876937d4b  /tmp/tst.xz
</code></pre>

Process substitution lets you do the same thing as the previous final check with the following syntax:

<pre class="prettyprint"><code>seq 1 100000 |
    shuf |
        tee >(
            echo gzip. $( gzip | wc -c )
          )  >(
            echo gzip, $( wc -c < <(gzip))
          ) >(
            gzip  | wc -c | sed s/^/gzip:\ \ /
          ) >(
            bzip2 | wc -c | xargs printf "bzip2: %s\n"
          ) >(
            lzma  | wc -c | perl -pe 's/^/lzma:  /'
          ) >(
            xz    | wc -c | awk '{printf "xz: %9s\n",$1}'
          ) > >(
            echo raw: $(wc -c)
          ) |
        xargs printf "%-8s %9d\n"

raw:        588895
xz:         254556
lzma:       254472
bzip2:      231111
gzip:       274867
gzip,       274867
gzip.       274867
</code></pre>

Note: I used three different ways to compute the <code>gzip</code> compressed size.

Note: because these operations run simultaneously, the output order depends on the time required by each command.
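If a stable report order matters more than seeing results as soon as they are ready, one option is to have every branch print a label and to sort the merged stream at the end. This is a sketch under my own assumptions rather than part of the answer above: the three consumers and their labels are illustrative, and the pipeline is again passed to <code>bash</code> through a here-document because <code>>(...)</code> is not POSIX.

```shell
# Sketch only: labels and consumers are illustrative choices.
report=$(bash <<'EOF'
seq 1 1000 |
  tee >(wc -c | awk '{ print "bytes", $1 }') \
      >(wc -l | awk '{ print "lines", $1 }') \
      >(sort -rn | head -n 1 | awk '{ print "max", $1 }') \
      >/dev/null |
  sort
EOF
)

# The branches finish in an unpredictable order; sort makes the report stable.
printf '%s\n' "$report"
```

The trailing <code>sort</code> cannot emit anything until every branch has closed its end of the pipe, so it also acts as a synchronisation point for the asynchronous consumers.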
<h3>Going further about parallelisation</h3>

If you run a multi-core or multi-processor computer, try to compare this:

<pre class="prettyprint"><code>i=1
time for file in /tmp/tst.*;do
    cmd=$(which ${file##*.}) || {
        cmd=$(file -b --mime-type $file)
        # keep what follows the last "/" or "-" of the mime type
        cmd=$(which ${cmd##*[/-]})
    }
    read -a md5 < <($cmd -d <$file|md5sum)
    echo $((i++)) $md5 \ $file
  done | cat -n
</code></pre>

which may render:

<pre class="prettyprint"><code>     1      1 29078875555e113b31bd1ae876937d4b  /tmp/tst.bz2
     2      2 29078875555e113b31bd1ae876937d4b  /tmp/tst.gz
     3      3 29078875555e113b31bd1ae876937d4b  /tmp/tst.lzma
     4      4 29078875555e113b31bd1ae876937d4b  /tmp/tst.xz

real    0m0.101s
</code></pre>

with this:

<pre class="prettyprint"><code>time  (
    i=1 pids=()
    for file in /tmp/tst.*;do
        cmd=$(which ${file##*.}) || {
            cmd=$(file -b --mime-type $file)
            # keep what follows the last "/" or "-" of the mime type
            cmd=$(which ${cmd##*[/-]})
        }
        (
             read -a md5 < <($cmd -d <$file|md5sum)
             echo $i $md5 \ $file
        ) & pids+=($!)
      ((i++))
      done
    wait ${pids[@]}
) | cat -n
</code></pre>

which could give:

<pre class="prettyprint"><code>     1      2 29078875555e113b31bd1ae876937d4b  /tmp/tst.gz
     2      1 29078875555e113b31bd1ae876937d4b  /tmp/tst.bz2
     3      4 29078875555e113b31bd1ae876937d4b  /tmp/tst.xz
     4      3 29078875555e113b31bd1ae876937d4b  /tmp/tst.lzma

real    0m0.070s
</code></pre>

where the ordering depends on the time used by each fork.
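When the jobs run in the background but the results are wanted in launch order, a common approach is to give each job its own output file and concatenate the files after <code>wait</code>. The sketch below is a simplified stand-in for the loop above, not a rewrite of it: the workload (<code>seq</code> piped into <code>awk</code>) and the temporary directory layout are hypothetical, chosen so the snippet runs in any POSIX-like shell that has <code>mktemp</code> and <code>seq</code>.

```shell
# Sketch only: the seq | awk workload and the file names are illustrative.
tmpdir=$(mktemp -d)

i=0
for count in 300 100 200; do
  i=$((i + 1))
  # Each background job writes to its own numbered file, so the jobs cannot
  # interleave their output no matter which one finishes first.
  seq 1 "$count" | awk -v id="$i" 'END { print id, NR }' >"$tmpdir/$i.out" &
done

# Block until every background job has exited.
wait

# Read the files back in launch order, independent of completion order.
result=$(cat "$tmpdir/1.out" "$tmpdir/2.out" "$tmpdir/3.out")
rm -rf "$tmpdir"

printf '%s\n' "$result"
```

The trade-off is that nothing is printed until all jobs are done, whereas the loop above prints each result as soon as its fork completes.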

OS X / Linux: pipe into two processes?


asked Apr 18 '12 21:04

Jason S

2 Answers


answered Oct 27 '22 20:10

inspector-g


answered Oct 27 '22 20:10

F. Hauri
