There are many ways to read a file into a string. Two common ones are using ifstream::read to read directly into a string, and using istreambuf_iterators together with std::copy_n:
Using ifstream::read:
std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
in.read(&contents[0], contents.size());
Using std::copy_n:
std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
std::copy_n(std::istreambuf_iterator<char>(in),
            contents.size(),
            contents.begin());
Many benchmarks show that the first approach is much faster than the second (on my machine, using g++-4.9, it is about 10 times faster with both the -O2 and -O3 flags). What is the reason for this difference in performance?
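For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two snippets could be wrapped and timed. The function names (stream_size, read_with_read, read_with_copy_n, average_ms) are hypothetical helpers, not part of any library. Opening the file with std::ios::binary is an assumption added so that the size reported by tellg matches the number of characters actually read. No timings are claimed here; results will vary by compiler, standard library and file size.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: file size in bytes, found by seeking to the end
// and back, as in the snippets above.
static std::size_t stream_size(std::ifstream& in) {
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    in.seekg(0, std::ios::beg);
    assert(size >= 0 && "could not determine file size (did the file open?)");
    return static_cast<std::size_t>(size);
}

// First approach: one bulk ifstream::read into the string's storage.
std::string read_with_read(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string contents(stream_size(in), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(contents.size()));
    return contents;
}

// Second approach: std::copy_n over an istreambuf_iterator.
std::string read_with_copy_n(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string contents(stream_size(in), '\0');
    std::copy_n(std::istreambuf_iterator<char>(in),
                contents.size(),
                contents.begin());
    return contents;
}

// Hypothetical timing helper: average wall-clock milliseconds per call.
template <typename Reader>
double average_ms(Reader reader, const std::string& path, int runs) {
    std::size_t total_bytes = 0;  // accumulates sizes so each result is used
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) {
        total_bytes += reader(path).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)total_bytes;
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

Calling average_ms(read_with_read, "./filename.txt", 100) and average_ms(read_with_copy_n, "./filename.txt", 100) on a reasonably large file should give comparable numbers for the two approaches. Note that the operating system's file cache will likely affect the first run.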
read is a single iostream setup (part of every iostream operation) and a single call to the OS, reading directly into the buffer you provided.
The iterator works by repeatedly extracting a single char with operator>>. Because of the buffer size, this might mean more OS calls, but more importantly it also means repeated setting up and tearing down of the iostream sentry, which might mean a mutex lock, and usually means a bunch of other stuff. Furthermore, operator>> is a formatted operation, whereas read is unformatted, which is additional setup overhead on every operation.
Edit: Tired eyes saw istream_iterator instead of istreambuf_iterator. Of course istreambuf_iterator does not do formatted input. It calls sbumpc or something like that on the streambuf. That is still a lot of calls, and they go through the stream's buffer, which is probably smaller than the entire file.
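To make the difference concrete, here is a rough sketch of the two access patterns at the streambuf level. It assumes that copying through istreambuf_iterator amounts to one sbumpc call per character, and that read forwards to a single bulk sgetn request (how a given standard library actually implements either is not guaranteed). The names drain_per_char and drain_bulk are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Per-character pattern: one streambuf call for every char copied.
std::string drain_per_char(std::streambuf& buf, std::size_t n) {
    using traits = std::char_traits<char>;
    std::string out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const traits::int_type c = buf.sbumpc();  // fetch one char, advance
        if (traits::eq_int_type(c, traits::eof())) {
            break;  // source ran out before n characters were read
        }
        out.push_back(traits::to_char_type(c));
    }
    return out;
}

// Bulk pattern: a single request for up to n characters.
std::string drain_bulk(std::streambuf& buf, std::size_t n) {
    std::string out(n, '\0');
    const std::streamsize got =
        buf.sgetn(&out[0], static_cast<std::streamsize>(n));
    assert(got >= 0);
    out.resize(static_cast<std::size_t>(got));  // keep only what was read
    return out;
}
```

Both functions return the same characters for the same input. The difference is that the first makes n calls into the streambuf while the second makes one, which gives a file-backed streambuf the chance to satisfy a large request without going through its internal buffer character by character.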