I have a large .txt file that I need to load and store in a vector. The file is around 5MB in size, 500 000 lines, and every line is around 10-20 characters, delimited with '\n'. I was doing some benchmarking on the time needed to read the whole file using a sample code below:
#include<iostream>
#include<vector>
#include<fstream>
int main()
{
std::fstream input("words.txt");
std::vector<std::string> vector;
std::string line;
while(input >> line){
vector.push_back(line);
}
}
I was curious if passing the string as an rvalue reference would prove faster but instead it was slower by about 10ms.
#include<iostream>
#include<vector>
#include<fstream>
int main()
{
std::fstream input("words.txt");
std::vector<std::string> vector;
std::string line;
while(input >> line){
vector.push_back(std::move(line));
}
}
The average load time for the first code example was about 58ms, and for the second one 68-70ms. I was thinking that moving is always faster or equal to copying, and that's why this doesn't seem right to me.
Does anyone know the reason to why this is happening? The benchmarks were done using:
perf stats -r 100 ./a.out
on Arch Linux, and the code has been compiled using GCC 10.2, C++17 std.
Also if anyone knows a more optimal way to do this would be much appreciated.
If you invoke g++ -E
you can ogle the relevant code:
Copy construction:
basic_string(const basic_string& __str)
: _M_dataplus(_M_local_data(),
_Alloc_traits::_S_select_on_copy(__str._M_get_allocator()))
{
_M_construct(__str._M_data(), __str._M_data() + __str.length());
}
Move construction:
# 552 "/usr/local/include/c++/10.2.0/bits/basic_string.h" 3
basic_string(basic_string&& __str) noexcept
: _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator()))
{
if (__str._M_is_local())
{
traits_type::copy(_M_local_buf, __str._M_local_buf,
_S_local_capacity + 1);
}
else
{
_M_data(__str._M_data());
_M_capacity(__str._M_allocated_capacity);
}
_M_length(__str.length());
__str._M_data(__str._M_local_data());
__str._M_set_length(0);
}
Notably, (to support the Short-String-Optimisation) the move constructor needs to look at ._M_is_local()
to work out whether to copy or move (so has a branch to predict), and it clears out the moved-from string / sets its length to 0. Extra work = extra time.
@Manuel made an interesting comment:
When you move line it gets empty, so the next iteration needs to allocate space. std::string has a small buffer as an optimization (most of the implementations, if not all) and the copy just copies chars, no memory allocation. That could be the difference.
That doesn't quite add up as stated, but there are subtle differences here. For input space-separated words long enough to need dynamic allocation, the:
a) move version may have cleared out line
's dynamic buffer after the last word, and therefore need to reallocate; if the input is long enough it may have to reallocate one or more times to increase capacity.
b) copy version will likely have a large-enough buffer (its capacity will be increased as necessary, effectively being a high-water-mark for the words seen), but then a dynamic allocation will be needed when copy constructing inside push_back
. The exact size of that allocation is known up front though - it won't need to be resized to grow capacity.
That does suggest copy could be faster when there's a lot of variation in the input word lengths.
Also if anyone knows a more optimal way to do this would be much appreciated.
If you really care about performance, I'd recommend benchmarking memory mapping the file and creating a vector
of string_view
s into it: that's likely to be considerably faster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With