I wanted to test the performance of writing to a file in a bash script vs a C++ program.
Here is the bash script:
#!/bin/bash
while true; do
echo "something" >> bash.txt
done
This added about 2-3 KB to the text file per second.
Here is the C++ code:
#include <iostream>
#include <fstream>
using namespace std;
int main() {
ofstream myfile;
myfile.open("cpp.txt");
while (true) {
myfile << "Writing this to a file Writing this to a file \n";
}
myfile.close();
}
This created a ~6 GB text file in less than 10 seconds.
What makes this C++ code so much faster, and/or this bash script so much slower?
C is by far the fastest of them all. Bash (the Bourne Again Shell) is itself a program written in C that interprets your script, which adds a layer of translation and reduces speed. The same goes for any other shell.
Python is drastically faster at text processing, which is a common operation. When I perform the same search 10000 times in each language, Bash takes 1m24s and Python takes 636ms. This is because Bash uses a subprocess for each text-processing operation, and subprocesses are slow to create.
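To make the subprocess point concrete, here is a minimal sketch (the function names and sample line are made up for illustration, and I have not timed it). The first helper forks a pipeline with an external cut for every call; the second does the same job with parameter expansion, which stays inside the shell:

```shell
#!/bin/bash
# Hypothetical helpers; names and sample data are illustrative only.

# Slow pattern: every call forks a pipeline and starts an external `cut`.
first_field_fork() {
    echo "$1" | cut -d: -f1
}

# Faster pattern: parameter expansion is handled by the shell itself,
# so no new process is created.
first_field_builtin() {
    printf '%s\n' "${1%%:*}"
}

line="root:x:0:0"
first_field_fork "$line"      # prints: root
first_field_builtin "$line"   # prints: root
```

Calling each helper a few thousand times in a loop should show the gap, although the exact ratio depends on the machine and the shell.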
Both shells are 2–30 times faster than bash depending on the test.
Shell loops are slow and bash's are the slowest. Shells aren't meant to do heavy work in loops. Shells are meant to launch a few external, optimized processes on batches of data.
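As a rough illustration of "launch one optimized process on a batch", the sketch below hands the whole loop to a single awk process instead of iterating in the shell. The function name, the line count, and the choice of awk are my assumptions, not something from the question:

```shell
#!/bin/bash
# write_batch COUNT FILE (hypothetical helper): one awk process generates
# all the lines, and the redirection opens FILE exactly once.
write_batch() {
    awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) print "something" }' > "$2"
}

out=$(mktemp)
write_batch 100000 "$out"
wc -l < "$out"    # number of lines written
rm -f "$out"
```

The shell only sets up the process and the redirection here; the per-line work happens inside awk.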
There are several reasons for this.
First off, interpreted execution environments (like bash, perl, non-JITed lua, python, etc.) are generally much slower than even poorly written compiled programs (C, C++, etc.).
Secondly, note how fragmented your bash code is: it writes one line to the file, then another, and so on. Your C++ program, on the other hand, performs buffered writes, even without any explicit effort on your part. You can see how much slower it runs if you replace
myfile << "Writing this to a file Writing this to a file \n";
with
myfile << "Writing this to a file Writing this to a file" << endl;
In short, endl writes a newline and then flushes the stream, while \n only writes the newline. For more information about how streams are implemented in C++, and why \n is different from endl, see any reference documentation on C++.
Thirdly, as the comments point out, your bash script opens and closes the target file for each line. This carries significant overhead in itself: imagine myfile.open and myfile.close moved inside your loop body!
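A sketch of how the open/close cost could be avoided in bash: put the redirection on the loop instead of on each echo, so the file is opened once. The function names are hypothetical, and I use a bounded COUNT instead of the question's infinite loop so the two variants can be compared:

```shell
#!/bin/bash
# Sketch only: function names and the bounded COUNT argument are illustrative.

# Opens and closes FILE once per iteration, like the script in the question.
append_per_line() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "something" >> "$2"
        i=$((i + 1))
    done
}

# Opens FILE once; every echo in the loop writes to the same open descriptor.
append_once() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "something"
        i=$((i + 1))
    done >> "$2"
}

a=$(mktemp)
b=$(mktemp)
append_per_line 1000 "$a"
append_once 1000 "$b"
cmp "$a" "$b" && echo "identical output"
rm -f "$a" "$b"
```

Both functions produce the same file, so any timing difference between them (for example with bash's time keyword and a larger count) comes from the repeated open/close rather than from the loop itself. The loop is still interpreted, so this will not close the gap with the C++ program.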