
Why is ifstream::read much faster than using iterators?

There are many ways to read a file into a string. Two common ones are calling ifstream::read to read directly into the string, and using std::istreambuf_iterator together with std::copy_n:

Using ifstream::read:

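#include <fstream>
#include <string>

std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);                     // jump to the end to learn the file size
contents.resize(in.tellg());
in.seekg(0, in.beg);                     // rewind before reading
in.read(&contents[0], contents.size());  // one bulk, unformatted read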

Using std::copy_n:

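#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);                     // jump to the end to learn the file size
contents.resize(in.tellg());
in.seekg(0, in.beg);                     // rewind before reading
std::copy_n(std::istreambuf_iterator<char>(in),  // copies one character at a time
            contents.size(),
            contents.begin());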

Many benchmarks show that the first approach is much faster than the second (on my machine, using g++-4.9, it is about 10 times faster with both -O2 and -O3). What is the reason for this difference in performance?
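For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the benchmark referred to above. The helper names read_with_read, read_with_copy_n and time_ms are hypothetical, and opening the file in binary mode is an assumption made so that the size reported by tellg matches the number of characters read.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: slurp a file with a single ifstream::read call.
std::string read_with_read(const std::string& path) {
    std::ifstream in{path, std::ios::binary};
    if (!in) return {};                        // could not open: return empty
    in.seekg(0, in.end);
    const std::streamoff size = in.tellg();    // file size in bytes
    in.seekg(0, in.beg);
    std::string contents(static_cast<std::size_t>(size), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(size));
    return contents;
}

// Hypothetical helper: same result, but copied character by character
// through std::istreambuf_iterator.
std::string read_with_copy_n(const std::string& path) {
    std::ifstream in{path, std::ios::binary};
    if (!in) return {};
    in.seekg(0, in.end);
    const std::streamoff size = in.tellg();
    in.seekg(0, in.beg);
    std::string contents(static_cast<std::size_t>(size), '\0');
    std::copy_n(std::istreambuf_iterator<char>(in),
                contents.size(),
                contents.begin());
    return contents;
}

// Hypothetical helper: wall-clock time of one call to f, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms([&] { read_with_read(path); }) and time_ms([&] { read_with_copy_n(path); }) on the same large file, several times each, gives a rough comparison. The actual ratio will depend on the compiler, the standard library and the file size.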

asked Apr 20 '15 at 21:04 by Veritas



1 Answer

read involves a single iostream setup (the sentry that is part of every iostream operation) and typically a single call to the OS, reading directly into the buffer you provided.
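To illustrate what a single bulk read into a caller-provided buffer looks like, here is a small sketch. The helper name read_block is hypothetical, and it takes any std::istream so that it can be tried on a string stream as well as a file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: one unformatted read() of up to n characters.
std::string read_block(std::istream& in, std::size_t n) {
    std::string buf(n, '\0');
    // One call: the stream is set up once and the characters are
    // transferred straight into buf.
    in.read(&buf[0], static_cast<std::streamsize>(n));
    // gcount() reports how many characters actually arrived.
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}
```

If fewer than n characters are available, read sets eofbit and failbit on the stream, so the caller can tell a short read from a complete one.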

The iterator works by repeatedly extracting a single char with operator>>. Because of the buffer size, this might mean more OS calls, but more importantly it also means repeated setting up and tearing down of the iostream sentry, which might mean a mutex lock, and usually means a bunch of other stuff. Furthermore, operator>> is a formatted operation, whereas read is unformatted, which is additional setup overhead on every operation.

Edit: Tired eyes saw istream_iterator instead of istreambuf_iterator. Of course istreambuf_iterator does not do formatted input. It calls sgetc and sbumpc on the streambuf. That is still a lot of calls, and they go through the stream's internal buffer, which is probably smaller than the entire file.
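As a rough sketch of the difference, the first function below does by hand approximately what copying through istreambuf_iterator amounts to, one streambuf call per character, while the second asks the streambuf for everything in one sgetn call. Both helper names are hypothetical, and this is not a copy of any standard library's implementation.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: per-character extraction, one sbumpc() per char.
std::string read_per_char(std::istream& in, std::size_t n) {
    using traits = std::char_traits<char>;
    std::streambuf* sb = in.rdbuf();
    std::string out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const traits::int_type c = sb->sbumpc();   // fetch one char, advance
        if (traits::eq_int_type(c, traits::eof())) break;
        out.push_back(traits::to_char_type(c));
    }
    return out;
}

// Hypothetical helper: bulk extraction, a single sgetn() call.
std::string read_bulk(std::istream& in, std::size_t n) {
    std::string out(n, '\0');
    const std::streamsize got =
        in.rdbuf()->sgetn(&out[0], static_cast<std::streamsize>(n));
    out.resize(static_cast<std::size_t>(got));     // keep only what was read
    return out;
}
```

Both functions return the same characters. The per-character version simply pays the cost of a function call and an end-of-buffer check for every character.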

answered Oct 11 '22 at 21:10 by Sebastian Redl
