I wrote multiple scripts in Perl and shell and I have compared the real execution time. In all the cases, the Perl script was more than 10 times faster than the shell script.
So I wondered: is it possible to write a shell script that is faster than the same script in Perl? And why is Perl faster than shell, even though I use the system function in the Perl script?
There are a few ways to make your shell scripts (e.g. Bash) execute faster.
Use sed, grep, awk etc. for string/text manipulation, but sparingly: a long pipeline such as command | grep | grep | cut | sed makes your code slow. Each pipe is an overhead.
For this example, just one awk does them all.
command | awk '{do everything here}'
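As a concrete sketch (the log format and patterns here are illustrative, not from the question), a multi-stage pipeline can usually be collapsed into a single awk invocation:

```shell
# Slow: four processes and three pipes.
# grep 'ERROR' app.log | grep -v 'DEBUG' | cut -d' ' -f3

# Fast: one awk process does the filtering and the field extraction.
printf 'a ERROR x\nb DEBUG y\nc ERROR z\n' |
awk '/ERROR/ && !/DEBUG/ { print $3 }'
# prints:
# x
# z
```

The saving comes from fork/exec and pipe overhead, not from awk's pattern matching being magically faster than grep's.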
The closest tool that can match Perl's speed for certain tasks, e.g. string manipulation or maths, is awk. Here's a fun benchmark for this solution. There are around 9 million numbers in the file. Output:
$ head -5 file
1
2
3
34
42
$ wc -l <file
8999987
$ time perl -nle '$sum += $_ } END { print $sum' file
290980117
real 0m13.532s
user 0m11.454s
sys 0m0.624s
$ time awk '{ sum += $1 } END { print sum }' file
290980117
real 0m9.271s
user 0m7.754s
sys 0m0.415s
$ time perl -nle '$sum += $_ } END { print $sum' file
290980117
real 0m13.158s
user 0m11.537s
sys 0m0.586s
$ time awk '{ sum += $1 } END { print sum }' file
290980117
real 0m9.028s
user 0m7.627s
sys 0m0.414s
On every run, awk was faster than Perl.
Lastly, try to learn awk beyond what it can do as a one-liner.
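To illustrate "beyond one-liners", here is a small standalone awk program (written to a temp file only to keep the example self-contained; the word-count task is my own illustration, not from the answer):

```shell
# A full awk program file: count word frequencies with an associative array.
cat > /tmp/wordfreq.awk <<'EOF'
{
    for (i = 1; i <= NF; i++)
        count[$i]++            # associative array keyed by word
}
END {
    for (w in count)
        print count[w], w
}
EOF

printf 'to be or not to be\n' | awk -f /tmp/wordfreq.awk | sort -rn
```

Structured awk programs like this often replace what would otherwise be a loop forking grep/cut/sed on every iteration.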
This might fall dangerously close to armchair optimization, but here are some ideas that might explain your results:
Fork/exec: almost anything useful that a shell script does is done via a shell-out, that is, starting a new process to run a command such as sed, awk, cat, etc. More often than not, more than one process is executed, and data is moved via pipes.
Data structures: Perl's data structures are more sophisticated than Bash's or Csh's. This typically forces the programmer to be creative with data storage, which can take several forms.
Non-optimized implementation: some shell constructs might not be designed with optimization in mind, but with user convenience. For example, I have reason to believe that the Bash implementation of parameter expansion, in particular ${foo//search/replace}, is sub-optimal relative to the same operation in sed. This is typically not a problem for day-to-day tasks.
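For what it's worth, the two forms produce the same result, and the trade-off cuts both ways: the builtin avoids a fork/exec entirely, while sed streams large inputs in a single process. A minimal sketch (the variable and replacement are illustrative):

```shell
foo="path/to/file"

# Builtin parameter expansion: no extra process is forked.
echo "${foo//\//_}"            # prints: path_to_file

# The same substitution via sed: one fork/exec, but it scales to
# megabytes of input in a single streaming pass.
echo "$foo" | sed 's,/,_,g'    # prints: path_to_file
```

For one small string the builtin usually wins; for a large file, piping once through sed beats expanding in a shell loop.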
Okay, I know I'm asking for it by opening up a can of worms closed two years ago, but I'm not 100% happy with any of the answers.
The right answer is YES. But most new coders will still go to Perl and Python and write code that struggles mightily to WRAP CALLS TO EXTERNAL EXECUTABLES because they lack the mentoring or experience required to know when to use which tools.
The Korn Shell (ksh) has fast builtin math, and a fully capable and speedy regex engine that, gasp, can handle Perl-type regexes. It also has associative arrays. It can even load external .so libraries. And it was a finished and mature product 10 years ago. It's even already installed on your Mac.
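As a small sketch of those builtins (my own example, not from the answer; written in syntax that ksh93 and bash 4+ share, so no external process is forked anywhere):

```shell
# Associative array: a ksh93 builtin, also accepted by bash 4+.
typeset -A price
price[apple]=3
price[pear]=5

# Builtin integer arithmetic: no fork to expr or bc.
total=0
for item in "${!price[@]}"; do
    (( total += price[$item] ))
done
echo "$total"    # prints: 8
```

The point of the answer stands: a script written this way stays inside one process, which is exactly where shell scripts usually lose to Perl.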