I have a Bash script like this:
data_generator_that_never_guits | while read data
do
    an_expensive_process_with "$data"
done
The first process continuously generates events (at irregular intervals), which need to be processed as they become available. A problem with this script is that read consumes only a single line of the output; as the processing is very expensive, I want it to consume all the data that is currently available. On the other hand, the processing must start immediately when new data becomes available. In a nutshell, I want to do something like this:
data_generator_that_never_guits | while read_all_available data
do
    an_expensive_process_with "$data"
done
where the command read_all_available would wait while no data is available for consumption, and otherwise copy all the currently available data into the variable. It is perfectly fine if the data does not consist of full lines. Basically, I am looking for an analog of read that reads the entire pipe buffer instead of just a single line from the pipe.
For the curious among you, the background of the question is that I have a build script which needs to trigger a rebuild when a source file changes, and I want to avoid triggering rebuilds too often. Please do not suggest that I use grunt, gulp or other available build systems; they do not work well for my purpose.
Thanks!
Pipes provide asynchronous execution of commands using buffered I/O routines. Thus, all the commands in the pipeline operate in parallel, each in its own process.
A pipe in Bash takes the standard output of one process and passes it as standard input into another process. Bash scripts support positional arguments that can be passed in at the command line.
Bash isn't asynchronous in the way that JavaScript is; however, by forking (running commands as separate processes) it can produce a result similar to an asynchronous call in another language.
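To make the two points above concrete, here is a minimal sketch. The function names pipeline_elapsed and background_elapsed are made up for this illustration. Each one runs two 2-second sleeps and prints the elapsed whole seconds, which should come out near 2 rather than 4 if the commands really run in parallel.

```shell
#!/usr/bin/env bash

# Hypothetical helper: both stages of a pipeline are started at once,
# so the total is roughly the longer stage, not the sum of both.
pipeline_elapsed() {
    local start=$SECONDS
    sleep 2 | sleep 2
    echo $(( SECONDS - start ))
}

# Hypothetical helper: '&' forks each command into the background,
# and 'wait' blocks until all background children have finished.
background_elapsed() {
    local start=$SECONDS
    sleep 2 &
    sleep 2 &
    wait
    echo $(( SECONDS - start ))
}
```

Calling pipeline_elapsed or background_elapsed prints the number of seconds taken. SECONDS only has whole-second resolution, so treat the result as approximate.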
From man bash: "-s: If the -s option is present, or if no arguments remain after option processing, then commands are read from the standard input. This option allows the positional parameters to be set when invoking an interactive shell."
I think I have found the solution after getting a better insight into how subshells work. This script appears to do what I need:
data_generator_that_never_guits | while true
do
    # wait until the next element becomes available
    read LINE
    # consume any remaining elements; a small timeout ensures that
    # rapidly fired events are batched together
    while read -t 1 LINE; do true; done
    # the data buffer is empty, launch the process
    an_expensive_process
done
It would be possible to collect all the lines read into a single batch, but I don't really care about their contents at this point, so I didn't bother figuring that part out :)
Added on 25.09.2014
Here is a final subroutine, in case it could be useful for someone one day:
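flushpipe() {
    local buffer line
    # wait until the next line becomes available
    # (note: read -d "" would wait for a NUL byte, not for a newline)
    IFS= read -r buffer || return
    # consume any remaining lines; a small timeout ensures that
    # rapidly fired events are batched together
    while IFS= read -r -t 1 line; do
        buffer+=$'\n'$line
    done
    # print the whole batch, preserving whitespace and line breaks
    printf '%s\n' "$buffer"
}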
To be used like this:
data_generator_that_never_guits | while true
do
    # wait until data becomes available and collect the whole batch
    data=$(flushpipe)
    # the pipe has been drained, launch the process on the batch
    an_expensive_process_with "$data"
done