
Asynchronously consuming pipe with bash

I have a bash script like this

data_generator_that_never_quits | while read -r data
do
 an_expensive_process_with "$data"
done

The first process continuously generates events (at irregular intervals) which need to be processed as they become available. A problem with this script is that read consumes only a single line of the output; and since the processing is very expensive, I want it to consume all the data that is currently available. On the other hand, the processing must start immediately when new data becomes available. In a nutshell, I want to do something like this

data_generator_that_never_quits | while read_all_available data 
do
 an_expensive_process_with "$data"
done

where the command read_all_available will wait if no data is available for consumption or copy all the currently available data to the variable. It is perfectly fine if the data does not consist of full lines. Basically, I am looking for an analog of read which would read the entire pipe buffer instead of reading just a single line from the pipe.

For the curious among you, the background of the question is that I have a build script which needs to trigger a rebuild on a source file change. I want to avoid triggering rebuilds too often. Please do not suggest grunt, gulp, or other available build systems; they do not work well for my purpose.

Thanks!

MrMobster asked Sep 24 '14 14:09



1 Answer

I think I have found the solution after I got better insight into how subshells work. This script appears to do what I need:

data_generator_that_never_quits | while true 
do
 # wait until the next element becomes available
 read -r LINE
 # consume any remaining elements; a small timeout ensures that
 # rapidly fired events are batched together
 while read -r -t 1 LINE; do :; done
 # the data buffer is empty, launch the process
 an_expensive_process
done
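The coalescing behavior can be checked with a short self-contained demo. Here is a sketch under assumed names (`demo_generator` and the `runs` counter are illustrative, not part of the original script): a burst of five events should collapse into a single run of the expensive step.

```shell
#!/usr/bin/env bash
# Illustrative demo: emit five events in one quick burst, then count how
# many times the "expensive" step runs. The drain loop should coalesce
# the whole burst into a single invocation.
demo_generator() {
  for i in 1 2 3 4 5; do echo "event $i"; done
}

runs=0
while read -r LINE
do
  # consume anything already buffered (or arriving within 1 second)
  while read -r -t 1 LINE; do :; done
  runs=$((runs + 1))
done < <(demo_generator)
echo "expensive step ran $runs time(s)"
```

Note the process substitution `< <(demo_generator)` instead of a pipe, so that `runs` is updated in the current shell rather than in a pipeline subshell.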

It would be possible to collect all the read lines into a single batch, but I don't really care about their contents at this point, so I didn't bother figuring that part out :)
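For what it's worth, collecting the drained lines is a small extension of the same loop; here is a sketch that appends each line to a bash array (the names `batch` and `out` are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative sketch: the same drain loop, but appending each drained
# line to a bash array so the whole batch is available afterwards.
out=$(printf 'a\nb\nc\n' | {
  read -r LINE
  batch=("$LINE")
  while read -r -t 1 LINE; do batch+=("$LINE"); done
  echo "batch of ${#batch[@]}: ${batch[*]}"
})
echo "$out"
```

Because the pipeline body runs in a subshell, the array only exists inside the braces; the result is captured via command substitution here.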

Added on 25.09.2014

Here is a final subroutine, in case it could be useful for someone one day:

flushpipe() {
 local buffer line
 # wait until the next chunk becomes available
 read -r -d "" buffer
 # consume any remaining elements; a small timeout ensures that
 # rapidly fired events are batched together
 while read -r -d "" -t 1 line; do buffer="$buffer"$'\n'"$line"; done
 printf '%s\n' "$buffer"
}

To be used like this:

data_generator_that_never_quits | while true 
do
 # wait until data becomes available
 data=$(flushpipe)
 # the data buffer is empty, launch the process
 an_expensive_process_with "$data"
done
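Here is a quick self-contained check of flushpipe on a finite burst of input (the subroutine is repeated so the snippet runs on its own): three lines fed in quick succession come back as one newline-joined batch.

```shell
#!/usr/bin/env bash
# Self-contained check of flushpipe on a finite burst of input.
flushpipe() {
  local buffer line
  # wait until the next chunk becomes available (NUL- or EOF-delimited)
  read -r -d "" buffer
  # consume any remaining elements; the timeout batches rapid events
  while read -r -d "" -t 1 line; do buffer="$buffer"$'\n'"$line"; done
  printf '%s\n' "$buffer"
}

batch=$(printf 'one\ntwo\nthree\n' | flushpipe)
echo "$batch"
```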
MrMobster answered Oct 21 '22 13:10