 

Performance Difference Between C and C++ Style File IO

I've always heard that C++ file I/O is much slower than C-style I/O, but I couldn't find any practical references showing how much slower it actually is. So I decided to test it on my machine (Ubuntu 12.04, GCC 4.6.3, ext4 partition).

First I wrote a ~900MB file to disk.

C++ (ofstream): 163s

ofstream file("test.txt");
    
for(register int i = 0; i < 100000000; i++) 
    file << i << endl;

C (fprintf): 12s

FILE *fp = fopen("test.txt", "w");
    
for(register int i = 0; i < 100000000; i++) 
    fprintf(fp, "%d\n", i);

I was expecting this result; it shows that writing to a file is much slower in C++ than in C. Then I read the same file back using C and C++ I/O. To my surprise, there was almost no difference in read performance.

C++ (ifstream): 12s

int n;
ifstream file("test.txt");

for(register int i = 0; i < 100000000; i++) 
    file >> n;

C (fscanf): 12s

FILE *fp = fopen("test.txt", "r");
    
for(register int i = 0; i < 100000000; i++) 
    fscanf(fp, "%d", &n);

So, why does writing with a stream take so long? Or, why is reading with a stream so fast compared to writing?

Conclusion: The culprit is std::endl, as the answers and comments have pointed out. Changing the line file << i << endl; to file << i << '\n'; reduced the running time from 163s to 16s.

Rafi Kamal asked Jul 04 '13 10:07


3 Answers

You're using endl to print a newline. That is the problem here, because endl does more than print a newline: it also flushes the buffer, which is an expensive operation if you do it in every iteration.

Use '\n' if a newline is all you want:

file << i << '\n'; 

Also, you must compile your code in release mode (i.e. turn on optimizations).
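
To make the flushing visible, here is a minimal sketch that counts how often the stream buffer is asked to synchronise. CountingBuf, flushes_with_endl and flushes_with_newline are hypothetical names made up for this illustration, and the sketch writes to an in-memory buffer rather than a file, so it shows the number of flush requests, not their cost on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>

// Hypothetical helper (not from the original post): an in-memory buffer
// that counts every flush request, i.e. every call to sync().
class CountingBuf : public std::stringbuf {
public:
    std::size_t flushes = 0;

protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Writes `lines` integers, each terminated by std::endl,
// and returns how many flush requests the buffer received.
std::size_t flushes_with_endl(int lines) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i)
        out << i << std::endl;  // newline plus flush on every iteration
    return buf.flushes;
}

// Same loop, but terminated by '\n': nothing asks for a flush.
std::size_t flushes_with_newline(int lines) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i)
        out << i << '\n';       // newline only
    return buf.flushes;
}
```

With a real ofstream, each of those flush requests typically ends in a write to the operating system, which is where the time goes in the question's loop.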

Nawaz answered Sep 29 '22 23:09


No, C++ input/output is not substantially slower than C’s – if anything, a modern implementation should be slightly faster on formatted input/output since it doesn’t need to parse a format string, and the formatting is instead determined at compile time through the chaining of the stream operators.

Here are a few caveats to consider in a benchmark:

  • Compile with full optimisations (-O3) to get a fair comparison.
  • A proper benchmark needs to estimate biases – in practice this means that you need to repeat your tests and interleave them. At the moment your code isn’t robust to disturbances from background processes. You should also report a summary statistic of the repeated runs to catch outliers that distort the estimates.
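  • Disable C++ stream synchronisation with C streams (std::ios_base::sync_with_stdio(false);). This setting applies to the standard streams such as std::cout, so it matters when the benchmark reads or writes through them rather than through a file stream.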
  • Use '\n' instead of the (flushing) std::endl
  • Don’t use register declarations – it simply makes no difference and modern compilers probably ignore it anyway.
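
The repeat-and-interleave advice could be sketched roughly as follows. This is only one possible harness, not a definitive one: median, time_once, compare and the Medians struct are names invented here, and the median is just one choice of summary statistic.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Median of a non-empty sample; takes a copy because sorting reorders it.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t mid = samples.size() / 2;
    return samples.size() % 2 ? samples[mid]
                              : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Wall-clock seconds taken by a single call of `work`.
double time_once(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Hypothetical result type: median run time in seconds per candidate.
struct Medians {
    double a;
    double b;
};

// Runs the two candidates alternately (A, B, A, B, ...) so that a
// background disturbance is less likely to penalise only one of them.
Medians compare(const std::function<void()>& a,
                const std::function<void()>& b, int rounds) {
    assert(rounds > 0);
    std::vector<double> times_a, times_b;
    for (int r = 0; r < rounds; ++r) {
        times_a.push_back(time_once(a));
        times_b.push_back(time_once(b));
    }
    return {median(times_a), median(times_b)};
}
```

A call might look like compare(write_with_ofstream, write_with_fprintf, 10), where the two functions (hypothetical here) wrap the loops from the question. The median is used because a single slow run barely moves it, unlike the mean.
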
Konrad Rudolph answered Sep 29 '22 23:09


When working with large files with fstream, make sure to give the stream a buffer with a size greater than zero.

Counterintuitively, disabling stream buffering dramatically reduces performance. At least the MSVC 2015 implementation copies 1 char at a time to the filebuf when no buffer was set (see streambuf::xsputn). This can make your application CPU-bound and so lower the I/O rate.

const size_t bufsize = 256*1024;
char buf[bufsize];
mystream.rdbuf()->pubsetbuf(buf, bufsize);

You can find a complete sample application here.
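
For readers without access to that sample, here is a small self-contained sketch of the same idea. It is not the linked application: write_numbers is a hypothetical helper, and the buffer is set before the file is opened on the assumption that some standard libraries only honour pubsetbuf at that point (the effect of the call is largely implementation-defined).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the integers 0..count-1, one per line,
// through an ofstream that uses a buffer of `bufsize` bytes.
bool write_numbers(const std::string& path, int count, std::size_t bufsize) {
    assert(bufsize > 0);

    // Heap storage, declared before the stream so it outlives it.
    std::vector<char> buf(bufsize);

    std::ofstream out;
    // Assumption: set the buffer before open(), since some
    // implementations ignore pubsetbuf once the file is open.
    out.rdbuf()->pubsetbuf(buf.data(),
                           static_cast<std::streamsize>(buf.size()));
    out.open(path);
    if (!out)
        return false;

    for (int i = 0; i < count; ++i)
        out << i << '\n';  // '\n' rather than std::endl: no flush per line

    out.close();           // flush while the buffer is still alive
    return static_cast<bool>(out);
}
```

Calling write_numbers("test.txt", 100000000, 256 * 1024) would mirror the question's loop with the 256 KiB buffer from this answer; whether it is faster on a given platform is something to measure rather than assume.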

rustyx answered Sep 30 '22 00:09