Returning values from functions when efficiency matters

Tags: bash

It seems to me that there are several ways to return a value from a Bash function.

Approach 1: Use a "local-global" variable, which is defined as local in the caller:

func1() {
    a=10
}

parent1() {
    local a

    func1
    a=$(($a + 1))
}

Approach 2: Use command substitution:

func2() {
    echo 10
}

parent2() {
    a=$(func2)
    a=$(($a + 1))
}

How much speedup could one expect from using approach 1 over approach 2?

Also, I know that using global variables as in approach 1 is not good programming practice, but could it at some point be justified by efficiency considerations?

Håkon Hægland asked Apr 18 '15 11:04



1 Answer

The single most expensive operation in shell scripting is forking. Any operation involving a fork, such as command substitution, will be 1-3 orders of magnitude slower than one that doesn't.

For example, here's a straightforward approach for a loop that reads a bunch of generated file names of the form file-1234 and strips the file- prefix using sed, requiring a total of three forks per iteration (command substitution + two-stage pipeline):

$ time printf "file-%s\n" {1..10000} |
     while read line; do n=$(echo "$line" | sed -e "s/.*-//"); done

real    0m46.847s

Here's a loop that does the same thing with parameter expansion, requiring no forks:

$ time printf "file-%s\n" {1..10000} |
     while read line; do n=${line#*-}; done

real    0m0.150s
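
One caveat when swapping sed for parameter expansion: the sed pattern `.*-` is greedy, while `${line#*-}` removes only the shortest matching prefix. For names like file-1234 the two agree, but with more than one dash they differ. A small sketch (the variable names `name`, `first` and `last` are made up for illustration):

```shell
#!/usr/bin/env bash
line="file-1234"
n=${line#*-}         # single dash: strips "file-", same result as the sed version

name="my-file-1234"  # hypothetical name with two dashes
first=${name#*-}     # '#'  removes the shortest prefix matching "*-"
last=${name##*-}     # '##' removes the longest prefix, like sed's greedy ".*-"

echo "$n $first $last"
```

If the input can contain several dashes and the sed behaviour is what you want, `##` is the closer match.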

The forky version takes roughly 300x longer (about 46.8 s versus 0.15 s).
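
The same kind of measurement can be made for the two approaches in the question. The following is only a sketch: `func1` and `func2` are taken from the question, while the loop helpers `loop_no_fork` and `loop_fork`, the `sum1`/`sum2` variables and the iteration count of 1000 are assumptions added for illustration. Actual timings depend heavily on the machine, so none are shown here.

```shell
#!/usr/bin/env bash
# Approach 1 from the question: the callee assigns to a variable
# that the caller has declared local.
func1() {
    a=10
}

# Approach 2 from the question: the callee prints the value and
# the caller captures it with command substitution (one fork per call).
func2() {
    echo 10
}

# Hypothetical helper: call func1 n times, no forks involved.
loop_no_fork() {
    local a i n=$1
    sum1=0
    for ((i = 0; i < n; i++)); do
        func1
        sum1=$((sum1 + a))
    done
}

# Hypothetical helper: call func2 n times through $(...).
loop_fork() {
    local a i n=$1
    sum2=0
    for ((i = 0; i < n; i++)); do
        a=$(func2)
        sum2=$((sum2 + a))
    done
}

# 'time' is a Bash keyword, so it can time shell functions directly.
time loop_no_fork 1000
time loop_fork 1000
```

Both loops compute the same sum; only the way the value travels back to the caller differs, so any gap between the two timings is the cost of the command substitution.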

Therefore, the answer to your question is yes: if efficiency matters, you have solid justification for factoring out or replacing forky code.
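
If the hard-coded variable name in approach 1 is what bothers you, one fork-free alternative is to pass the name of the result variable as an argument and assign to it with `printf -v`. This is a sketch of that idea, not something from the question: `func3`, `parent3` and `result` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stores its result in the variable whose
# name is passed as $1, without forking.
func3() {
    printf -v "$1" '%s' 10
}

parent3() {
    local a
    func3 a          # assigns to parent3's local 'a' via dynamic scoping
    a=$((a + 1))
    result=$a        # made global only so the effect can be inspected
}

parent3
echo "$result"
```

The caller now chooses the variable name, so the callee no longer depends on a particular global. The name passed in must not collide with a local variable declared inside the callee, or the assignment will land on that local instead.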

Once the fork count is constant with respect to the input (or it's too messy to make it constant) and the code is still too slow, that's when you should rewrite it in a faster language.

that other guy answered Sep 28 '22 08:09
