
Why are std::fstreams so slow?

I was working on a simple parser and when profiling I observed the bottleneck is in... file read! I extracted very simple test to compare the performance of fstreams and FILE* when reading a big blob of data:

#include <stdio.h>
#include <cstring>  // for memset (missing in the original)
#include <chrono>
#include <fstream>
#include <iostream>
#include <functional>

void measure(const std::string& test, std::function<void()> function)
{
    auto start_time = std::chrono::high_resolution_clock::now();

    function();

    auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
    std::cout << test << " " << static_cast<double>(duration.count()) * 0.000001 << " ms" << std::endl;
}

#define BUFFER_SIZE (1024 * 1024 * 1024)

int main(int argc, const char* argv[])
{
    auto buffer = new char[BUFFER_SIZE];
    memset(buffer, 123, BUFFER_SIZE);

    measure("FILE* write", [buffer]()
    {
        FILE* file = fopen("test_file_write", "wb");
        fwrite(buffer, 1, BUFFER_SIZE, file);
        fclose(file);
    });
    measure("FILE* read", [buffer]()
    {
        FILE* file = fopen("test_file_read", "rb");
        fread(buffer, 1, BUFFER_SIZE, file);
        fclose(file);
    });
    measure("fstream write", [buffer]()
    {
        std::ofstream stream("test_stream_write", std::ios::binary);
        stream.write(buffer, BUFFER_SIZE);
    });
    measure("fstream read", [buffer]()
    {
        std::ifstream stream("test_stream_read", std::ios::binary);
        stream.read(buffer, BUFFER_SIZE);
    });

    delete[] buffer;
}

The results of running this code on my machine are:

FILE* write   1388.59 ms
FILE* read    1292.51 ms
fstream write 3105.38 ms
fstream read  3319.82 ms

fstream write/read are about twice as slow as FILE* write/read! And this is just reading a big blob of data, without any parsing or other fstream features. I'm running the code on Mac OS, on an Intel i7 2.6 GHz with 16 GB 1600 MHz RAM and an SSD. Please note that re-running the same code makes FILE* read very fast (about 200 ms), probably because the file gets cached... which is why the files opened for reading are not created by this code.

Why is reading just a blob of binary data so slow with fstream compared to FILE*?

EDIT 1: I updated the code and the times. Sorry for the delay!

EDIT 2: I added the command line and new results (very similar to the previous ones!)

$ clang++ main.cpp -std=c++11 -stdlib=libc++ -O3
$ ./a.out
FILE* write   1417.9 ms
FILE* read    1292.59 ms
fstream write 3214.02 ms
fstream read  3052.56 ms

Here are the results for a second run:

$ ./a.out
FILE* write   1428.98 ms
FILE* read    196.902 ms
fstream write 3343.69 ms
fstream read  2285.93 ms

It looks like the file gets cached on read for both FILE* and fstream, since the time drops by roughly the same amount for both of them.
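Since this caching effect keeps showing up between runs, here is a sketch of how one might evict a file from the page cache between benchmark runs. posix_fadvise(POSIX_FADV_DONTNEED) is Linux behaviour; on OS X the closest mechanism I know of is fcntl(fd, F_NOCACHE, 1), so this is an assumption-laden sketch rather than a portable fix:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Ask the kernel to drop cached pages for one file (Linux semantics;
// len == 0 means "from offset to end of file"). Returns true on success.
bool evict_from_page_cache(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return false;
    // Flush first so dirty pages are not pinned in the cache.
    fdatasync(fd);
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return rc == 0;
}
```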

EDIT 3: I reduced the code to this:

FILE* file = fopen("test_file_write", "wb");
fwrite(buffer, 1, BUFFER_SIZE, file);
fclose(file);

std::ofstream stream("test_stream_write", std::ios::binary);
stream.write(buffer, BUFFER_SIZE);

And started the profiler. It seems the stream spends a lot of time in the xsputn function, while the actual write calls have the same duration (as they should - it's the same function...)

Running    Time     Self     Symbol Name
3266.0ms   66.9%       0.0   std::__1::basic_ostream<char, std::__1::char_traits<char> >::write(char const*, long)
3265.0ms   66.9%    2145.0       std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn(char const*, long)
1120.0ms   22.9%       7.0           std::__1::basic_filebuf<char, std::__1::char_traits<char> >::overflow(int)
1112.0ms   22.7%       2.0               fwrite
1127.0ms   23.0%       0.0   fwrite

EDIT 4: For some reason this question was marked as a duplicate. I want to point out that I don't use printf at all; I only use std::cout to print the timings. The files used in the read part are the output of the write part, copied under a different name to avoid caching.

asked Sep 29 '14 by Mircea Ispas


1 Answer

It would seem that, on Linux, for this large set of data, the implementation of fwrite is much more efficient, since it uses write rather than writev.

I'm not sure WHY writev is so much slower than write, but that appears to be where the difference lies. And I see no real reason why fstream would need to use that construct in this case.

This can easily be seen by using strace ./a.out (where a.out is the program testing this).

Output:

Fstream:

clock_gettime(CLOCK_REALTIME, {1411978373, 114560081}) = 0
open("test", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
writev(3, [{NULL, 0}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1073741824}], 2) = 1073741824
close(3)                                = 0
clock_gettime(CLOCK_REALTIME, {1411978386, 376353883}) = 0
write(1, "fstream write 13261.8 ms\n", 25) = 25

FILE*:

clock_gettime(CLOCK_REALTIME, {1411978386, 930326134}) = 0
open("test", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1073741824) = 1073741824
clock_gettime(CLOCK_REALTIME, {1411978388, 584197782}) = 0
write(1, "FILE* write 1653.87 ms\n", 23) = 23

I don't have one of those fancy SSD drives, so my machine will be a bit slower on that - or something else is slower in my case.

As pointed out by Jan Hudec, I'm misinterpreting the results. I just wrote this:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <chrono>

void measure(const std::string& test, std::function<void()> function)
{
    auto start_time = std::chrono::high_resolution_clock::now();

    function();

    auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
    std::cout << test << " " << static_cast<double>(duration.count()) * 0.000001 << " ms" << std::endl;
}

#define BUFFER_SIZE (1024 * 1024 * 1024)

int main()
{
    auto buffer = new char[BUFFER_SIZE];
    memset(buffer, 0, BUFFER_SIZE);

    measure("writev", [buffer]()
    {
        int fd = open("test", O_CREAT|O_WRONLY);
        struct iovec vec[] =
        {
            { NULL, 0 },
            { (void *)buffer, BUFFER_SIZE }
        };
        writev(fd, vec, sizeof(vec)/sizeof(vec[0]));
        close(fd);
    });

    measure("write", [buffer]()
    {
        int fd = open("test", O_CREAT|O_WRONLY);
        write(fd, buffer, BUFFER_SIZE);
        close(fd);
    });
}

And the result is pretty much identical for both cases, and faster than both the fstream and FILE* variants in the question.

It is the actual fstream implementation that does something daft - probably copying the whole data in small chunks, somewhere and somehow, or something like that. I will try to find out further.

Edit:

It would seem that, on my machine, right now, if you add fclose(file) after the write, it takes approximately the same amount of time for both fstream and FILE* - around 13 seconds on my system to write 1 GB, with an old-style spinning disk rather than an SSD.

I can however write MUCH faster using this code:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <chrono>

void measure(const std::string& test, std::function<void()> function)
{
    auto start_time = std::chrono::high_resolution_clock::now();

    function();

    auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
    std::cout << test << " " << static_cast<double>(duration.count()) * 0.000001 << " ms" << std::endl;
}

#define BUFFER_SIZE (1024 * 1024 * 1024)

int main()
{
    auto buffer = new char[BUFFER_SIZE];
    memset(buffer, 0, BUFFER_SIZE);

    measure("writev", [buffer]()
    {
        int fd = open("test", O_CREAT|O_WRONLY, 0660);
        struct iovec vec[] =
        {
            { NULL, 0 },
            { (void *)buffer, BUFFER_SIZE }
        };
        writev(fd, vec, sizeof(vec)/sizeof(vec[0]));
        close(fd);
    });

    measure("write", [buffer]()
    {
        int fd = open("test", O_CREAT|O_WRONLY, 0660);
        write(fd, buffer, BUFFER_SIZE);
        close(fd);
    });
}

gives times of about 650-900 ms.

I can also edit the original program to give a time of approximately 1000 ms for fwrite - simply by removing the fclose.

I also added this method:

measure("fstream write (new)", [buffer]()
{
    std::ofstream* stream = new std::ofstream("test", std::ios::binary);
    stream->write(buffer, BUFFER_SIZE);
    // Intentionally no delete.
});

and then it takes about 1000 ms here too.

So, my conclusion is that, somehow, sometimes, closing the file makes it flush to disk. In other cases, it doesn't. I still don't understand why...

answered Oct 07 '22 by Mats Petersson