
How expensive is a bash function call really?

Tags:

bash

Just out of interest, are there any sources on how expensive function calls in Bash really are? I expect them to be several times slower than executing the code within them directly, but I can't seem to find anything about this.

asked Dec 10 '12 by helpermethod


1 Answer

I don't really agree that performance should not be a worry when programming in bash. It's actually a very good question to ask.

Here's a possible benchmark, comparing the builtin true and the command true, the full path of which is /bin/true on my machine.

On my machine:

$ time for i in {0..1000}; do true; done

real    0m0.004s
user    0m0.004s
sys 0m0.000s
$ time for i in {0..1000}; do /bin/true; done

real    0m2.660s
user    0m2.880s
sys 0m2.344s

Amazing! That's about 2 to 3 ms wasted by just forking a process (on my machine)!
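Incidentally, the title question — a function call versus running its body inline — can be measured the same way. A minimal sketch (the function name f is made up):

```shell
# Wrap the builtin in a trivial function and time both forms.
f() { true; }

time for i in {0..1000}; do f; done     # via a function call
time for i in {0..1000}; do true; done  # body executed directly
```

Both loops finish in a few milliseconds: a function call stays in the same process, so its overhead is orders of magnitude below the cost of a fork.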

So next time you have some large text file to process, you'll avoid long chains of piped cats, greps, awks, cuts, trs, seds, heads, tails, you-name-its. Besides, Unix pipes are also very slow (will that be your next question?).

Imagine you have a 1000-line file, and for each line you run a cat, then a grep, then a sed and then an awk (no, don't laugh, you can see even worse by going through the posts on this site!). Then you're already wasting (on my machine) at least 2ms × 4 processes × 1000 lines = 8000ms = 8s just forking useless processes.
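To make that concrete, here is a sketch (file name, pattern and substitution are invented) contrasting per-line forking with a single pass over the file:

```shell
# Build a small sample file.
printf 'foo a\nbar a\nfoo b\n' > demo.txt

# Anti-pattern: forks two processes per line, so cost grows with line count.
while IFS= read -r line; do
  printf '%s\n' "$line" | grep foo | sed 's/a/X/'
done < demo.txt

# One awk process handles the whole file, however long it is.
awk '/foo/ { gsub(/a/, "X"); print }' demo.txt
```

Both print the same result; only the second keeps the number of forks constant as the file grows.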

Update. To answer your comment about pipes...

Subshells

Subshells are very slow:

$ time for i in {1..1000}; do (true); done

real    0m2.465s
user    0m2.812s
sys 0m2.140s

Amazing! over 2ms per subshell (on my machine).
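If you only need the grouping and not the isolation, braces run the compound command in the current shell, so the fork above disappears. A sketch:

```shell
time for i in {1..1000}; do (true); done     # ( ) forks a subshell each time
time for i in {1..1000}; do { true; }; done  # { } stays in the current shell
```

The brace version also means assignments inside the group survive, which is often what you wanted anyway.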

Pipes

Pipes are also very slow (as you'd expect, given that they involve subshells):

$ time for i in {1..1000}; do true | true; done

real    0m4.769s
user    0m5.652s
sys 0m4.240s

Amazing! over 4ms per pipe (on my machine), so that's 2ms for just the pipe (subtracting the time for the subshell).
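Those subshells matter for correctness, not just speed: a variable set on either side of a pipe dies with its subshell. A here-string keeps read in the current shell and saves a fork (a sketch; lastpipe is assumed off, which is the default):

```shell
echo hello | read word             # read runs in a subshell...
echo "after pipe: ${word:-unset}"  # ...so word is gone here

read word <<< hello                # read runs in the current shell
echo "after here-string: $word"    # word is set
```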

Redirection

$ time for i in {1..1000}; do true > file; done

real    0m0.014s
user    0m0.008s
sys 0m0.008s

So that's pretty fast.

Ok, you probably also want to see it in action with creation of a file:

$ rm file*; time for i in {1..1000}; do true > file$i; done

real    0m0.030s
user    0m0.008s
sys 0m0.016s

Still decently fast.

Pipes vs redirections:

In your comment, you mention:

sed '' filein > filetmp; sed '' filetmp > fileout

vs

sed '' filein | sed '' > fileout

(Of course, the best thing would be to use a single instance of sed (it's usually possible), but that doesn't answer the question.)
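The single-sed version that parenthesis alludes to just chains expressions with -e (or with ; inside one script), for one process total. A sketch with made-up substitutions:

```shell
printf 'aaa\n' > demo.txt

sed 's/a/b/' demo.txt | sed 's/b/c/'  # two processes, two forks
sed -e 's/a/b/' -e 's/b/c/' demo.txt  # one process, same output: caa
```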

Let's check that out:

A funny way:

$ rm file*
$ > file
$ time for i in {1..1000}; do sed '' file | sed '' > file$i; done

real    0m5.842s
user    0m4.752s
sys 0m5.388s
$ rm file*
$ > file
$ time for i in {1..1000}; do sed '' file > filetmp$i; sed '' filetmp$i > file$i; done

real    0m6.723s
user    0m4.812s
sys 0m5.800s

So it seems faster to use a pipe rather than using a temporary file (for sed). In fact, this could have been understood without typing the lines: in a pipe, as soon as the first sed spits out something, the second sed starts processing the data. In the second case, the first sed does its job, and then the second sed does its job.

So our experiment is not a good way of determining whether pipes are better than redirections.

How about process substitutions?

$ rm file*
$ > file
$ time for i in {1..1000}; do sed '' > file$i < <(sed '' file); done

real    0m7.899s
user    0m1.572s
sys 0m3.712s

Wow, that's slow! But observe the user and system CPU usage: much lower than for the other two possibilities (if someone can explain that...).

answered Nov 10 '22 by gniourf_gniourf