I started looking into Groovy today, and I am considering using it to replace some of my more complex bash scripts.
One feature I find very interesting is how easily it lets you pipe processes together:
proc1 = 'ls'.execute()
proc2 = 'tr -d o'.execute()
proc3 = 'tr -d e'.execute()
proc4 = 'tr -d i'.execute()
proc1 | proc2 | proc3 | proc4
proc4.waitFor()
That's amazing. But my question is: does this use real UNIX pipes (when run on Linux, for example), or is it just a simulation with Java streams? (And if it is a simulation, is it much slower or less efficient?)
There are two types of pipes you can use on a Unix system: regular (unnamed, or anonymous) pipes and named pipes. The two types share some advantages but are used and implemented very differently.
A pipe combines two or more commands: the output of one command acts as the input to the next, whose output may in turn act as the input to the command after it, and so on. It can be thought of as a temporary connection between two or more commands, programs, or processes.
The pipe may be the most useful tool in your shell scripting toolbox. It is one of the most used but also one of the most misunderstood, and as a result it is often overused or misused. Understanding how it works helps you use it correctly and can make your shell scripts faster and more efficient.
The vertical bar, |, between two commands is called a pipe. It tells the shell to use the output of the command on the left as the input to the command on the right.
Through operator overloading, the | operator on Process objects eventually calls ProcessGroovyMethods.pipeTo() in the Groovy runtime, which does indeed simulate pipes using Java streams:
/**
 * Allows one Process to asynchronously pipe data to another Process.
 *
 * @param left  a Process instance
 * @param right a Process to pipe output to
 * @return the second Process to allow chaining
 * @throws java.io.IOException if an IOException occurs.
 * @since 1.5.2
 */
public static Process pipeTo(final Process left, final Process right) throws IOException {
    new Thread(new Runnable() {
        public void run() {
            InputStream in = new BufferedInputStream(getIn(left));
            OutputStream out = new BufferedOutputStream(getOut(right));
            byte[] buf = new byte[8192];
            int next;
            try {
                while ((next = in.read(buf)) != -1) {
                    out.write(buf, 0, next);
                }
            } catch (IOException e) {
                throw new GroovyRuntimeException("exception while reading process stream", e);
            } finally {
                closeWithWarning(out);
            }
        }
    }).start();
    return right;
}
I can't speak to the amount of overhead involved off the top of my head though.
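To get a feel for what that copying thread does, here is a minimal standalone Java sketch of the same technique. It is not the Groovy runtime's code: the class name StreamPipeSketch and the pump and run helpers are made up for illustration, and it assumes Java 9 or later plus echo and tr on the PATH (a typical Linux box).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Groovy or the JDK.
final class StreamPipeSketch {

    // Copies left's stdout into right's stdin on a background thread,
    // then closes right's stdin so it sees end-of-input.
    static Thread pump(Process left, Process right) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            int n;
            try (InputStream in = left.getInputStream();
                 OutputStream out = right.getOutputStream()) {
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            } catch (IOException e) {
                throw new RuntimeException("exception while copying process stream", e);
            }
        });
        t.start();
        return t;
    }

    // Runs "leftCmd | rightCmd" with the JVM as the middleman and
    // returns everything rightCmd wrote to its stdout.
    static String run(String[] leftCmd, String[] rightCmd)
            throws IOException, InterruptedException {
        Process left = new ProcessBuilder(leftCmd).start();
        Process right = new ProcessBuilder(rightCmd).start();
        Thread copier = pump(left, right);
        byte[] result = right.getInputStream().readAllBytes();
        copier.join();
        left.waitFor();
        right.waitFor();
        return new String(result, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String out = run(new String[] {"echo", "groovy pipes"},
                         new String[] {"tr", "-d", "o"});
        System.out.print(out); // "grvy pipes" followed by a newline
    }
}
```

The point of the sketch is that every byte goes from the first process into the JVM and back out to the second process, whereas a shell pipeline connects the two processes directly. That extra hop is a plausible source of overhead, though I have not measured how much of it matters in practice.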
I found Groovy's pipe simulation much slower than Unix pipes.
The Bash command
zcat dump.sql.gz | mysql -u${mysql_user} --password=${mysql_password} -D${db_name}
takes about 40 minutes.
The same thing with Groovy,
def proc1 = ["zcat", "${sqlGzFile.getPath()}"].execute()
def proc2 = ["mysql", "-u${mysqlUser}", "--password=${mysqlPassword}", "-D$dbName"].execute()
proc1 | proc2
proc2.waitFor()
takes about 2 hours and 40 minutes.
But you can get real pipes by letting a shell build the pipeline:
def proc = ["sh", "-c", "zcat dump.sql.gz | mysql -u${mysqlUser} --password=${mysqlPassword} -D${dbName}"].execute()
proc.waitFor()
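If you would rather not go through a shell, Java 9 added ProcessBuilder.startPipeline, which as far as I know links each process's standard output directly to the next process's standard input, with no copying thread in the JVM. Below is a minimal sketch you could also call from Groovy. The class name NativePipelineSketch and the run helper are hypothetical, it assumes echo and tr on the PATH, and I have not benchmarked it against the mysql import above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only ProcessBuilder.startPipeline is a real JDK API here.
final class NativePipelineSketch {

    // Starts the commands as one pipeline (cmd1 | cmd2 | ...) and
    // returns everything the last command wrote to its stdout.
    static String run(String[]... commands) throws IOException, InterruptedException {
        List<ProcessBuilder> builders = new ArrayList<>();
        for (String[] command : commands) {
            // Send each stage's stderr to our own stderr so it cannot fill up unread.
            builders.add(new ProcessBuilder(command)
                    .redirectError(ProcessBuilder.Redirect.INHERIT));
        }
        List<Process> processes = ProcessBuilder.startPipeline(builders);
        Process last = processes.get(processes.size() - 1);
        byte[] result = last.getInputStream().readAllBytes();
        for (Process p : processes) {
            p.waitFor();
        }
        return new String(result, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Same shape as the ls | tr | tr | tr example, with echo as a fixed input.
        String out = run(new String[] {"echo", "groovy pipes"},
                         new String[] {"tr", "-d", "o"},
                         new String[] {"tr", "-d", "e"},
                         new String[] {"tr", "-d", "i"});
        System.out.print(out); // "grvy pps" followed by a newline
    }
}
```

Compared with the sh -c approach, this keeps each argument as a separate list element, so values such as the password are never parsed by a shell.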