std::accumulate C++20 version

Question

I'm trying to understand this code but I can't figure out why this version

for (; first != last; ++first) 
    init = std::move(init) + *first;

is faster than this one

for (; first != last; ++first)
    init += *first;

I took both from std::accumulate. The assembly code for the first version is longer than for the second one. Even if the first version creates an rvalue reference to init, it still creates a temporary value by adding *first and then assigns it to init, which is the same process as in the second case, where a temporary value is created and then assigned to init. So why is using std::move better than appending the value with the += operator?

EDIT

I was looking at the code of the C++20 version of accumulate, and they say that before C++20 accumulate was this:

template<class InputIt, class T>
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

and since C++20 it has become:

template<class InputIt, class T>
constexpr // since C++20
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = std::move(init) + *first; // std::move since C++20
    }
    return init;
}

I just wanted to know whether using std::move brings any real improvement or not.

EDIT2

Ok, here is my example code:

#include <chrono>
#include <iostream>
#include <string>    // std::string was used without its header
#include <utility>

using ck = std::chrono::high_resolution_clock;

std::string
test_with_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = std::move(str) + b;   // With std::move

    return str;
}

std::string
test_no_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = str + b;              // Without std::move

    return str;
}

int main()
{
    std::string result;
    auto start = ck::now();
    result = test_no_move("test");
    auto finish = ck::now();

    std::cout << "Test without std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    start = ck::now();
    result = test_with_move("test");
    finish = ck::now();

    std::cout << "Test with std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    return 0;
}

If you run it, you will notice that the std::move version is much faster than the other one, but if you try it with built-in types, the std::move version comes out slower than the other one.

So my question is: since this situation is probably the same as in std::accumulate, why do they say the C++20 version of accumulate with std::move is faster than the version without it? Why do I get such an improvement when using std::move with something like strings, but not with something like int? And why is there any difference at all, if in both cases the program creates a temporary string str + b (or std::move(str) + b) and then moves it into str? I mean, it is the same operation. Why is the std::move version faster?

Thanks for your patience. I hope I made myself clear this time.

Evg · Accepted Answer

It is potentially faster for types with non-trivial move semantics. Consider accumulating a std::vector<std::string> of sufficiently long strings:

std::vector<std::string> strings(100, std::string(100, ' '));

std::string init;
init.reserve(10000);
auto r = accumulate(strings.begin(), strings.end(), std::move(init));

For accumulate without std::move,

std::string operator+(const std::string&, const std::string&);

will be used. At each iteration it will allocate storage on the heap for the resulting string, only to throw it away at the next iteration.

For accumulate with std::move,

std::string operator+(std::string&&, const std::string&);

will be used. In contrast to the previous case, the buffer of the first argument can be reused. If the initial string has enough capacity, no additional memory will be allocated during accumulation.
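The overload selection described above can be observed directly with a small probe type. This is only a sketch: Probe, Picked, pick_without_move and pick_with_move are hypothetical names invented for illustration, and the two operator+ overloads merely mimic the shape of the std::string ones by recording which of them was called.

```cpp
#include <utility>

// Hypothetical tag recording which operator+ overload was selected.
enum class Picked { ConstRef, Rvalue };

// Hypothetical stand-in for std::string; it only remembers the overload used.
struct Probe {
    Picked last = Picked::ConstRef;
};

// Same shape as operator+(const std::string&, const std::string&).
Probe operator+(const Probe&, const Probe&) { return Probe{Picked::ConstRef}; }

// Same shape as operator+(std::string&&, const std::string&).
Probe operator+(Probe&&, const Probe&) { return Probe{Picked::Rvalue}; }

// Loop body of accumulate before C++20: the left operand is an lvalue.
Picked pick_without_move() {
    Probe init, elem;
    init = init + elem;
    return init.last;
}

// Loop body of accumulate since C++20: the left operand is an rvalue.
Picked pick_with_move() {
    Probe init, elem;
    init = std::move(init) + elem;
    return init.last;
}
```

Here pick_without_move() reports the const-reference overload and pick_with_move() reports the rvalue overload, which is the one that is allowed to reuse the buffer of its first argument.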

Simple demo

without std::move
n_allocs = 199

with std::move
n_allocs = 0
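A demo along these lines could be sketched as follows. This is not necessarily the code behind the numbers above: it assumes that counting calls to a replaced global operator new is an acceptable way to count heap allocations, and accumulate_copy, accumulate_move and count_allocs are names made up for this sketch. The exact count without std::move depends on how the standard library implements operator+, so only the with-move result of zero should be expected to match across common implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Counter incremented by the replaced global operator new below.
static std::size_t n_allocs = 0;

void* operator new(std::size_t size) {
    ++n_allocs;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Loop as in accumulate before C++20: operator+ receives init as an lvalue.
template <class InputIt, class T>
T accumulate_copy(InputIt first, InputIt last, T init) {
    for (; first != last; ++first) init = init + *first;
    return init;
}

// Loop as in accumulate since C++20: operator+ receives init as an rvalue.
template <class InputIt, class T>
T accumulate_move(InputIt first, InputIt last, T init) {
    for (; first != last; ++first) init = std::move(init) + *first;
    return init;
}

// Returns how many times operator new ran during the accumulation itself.
std::size_t count_allocs(bool use_move) {
    std::vector<std::string> strings(100, std::string(100, ' '));
    std::string init;
    init.reserve(10000);  // enough room for the whole result

    const std::size_t before = n_allocs;
    std::string r = use_move
        ? accumulate_move(strings.begin(), strings.end(), std::move(init))
        : accumulate_copy(strings.begin(), strings.end(), std::move(init));
    const std::size_t after = n_allocs;

    assert(r.size() == 10000);  // both variants build the same string
    return after - before;
}
```

Calling count_allocs(false) and count_allocs(true) from main and printing the results gives the two figures to compare. The snapshot of n_allocs is taken after the vector and the reserved buffer have been set up, so only allocations made inside the loop are counted.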

For built-in types like int, move is just a copy - there is nothing to move. For an optimized build, most likely you'll get exactly the same assembly code. If your benchmarking shows any speed improvement/degradation, most likely you're not doing it correctly (no optimization, noise, code optimized out, etc.).
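To make the point about built-in types concrete, here is a minimal sketch, assuming a C++14 or later compiler; sum_plus, sum_move, moved_int_keeps_value and data are hypothetical names used only here. It checks at compile time that both loop bodies give the same result for int and that moving from an int leaves the source unchanged.

```cpp
#include <utility>

// Loop body without std::move.
constexpr int sum_plus(const int* first, const int* last, int init) {
    for (; first != last; ++first) init = init + *first;
    return init;
}

// Loop body with std::move; for int the cast changes nothing observable.
constexpr int sum_move(const int* first, const int* last, int init) {
    for (; first != last; ++first) init = std::move(init) + *first;
    return init;
}

// "Moving" an int copies it: the source keeps its value.
constexpr bool moved_int_keeps_value() {
    int a = 42;
    int b = std::move(a);
    return a == 42 && b == 42;
}

constexpr int data[] = {1, 2, 3, 4};

static_assert(sum_plus(data, data + 4, 0) == sum_move(data, data + 4, 0),
              "both loops compute the same sum for int");
static_assert(moved_int_keeps_value(), "move of int is a copy");
```

This only shows that the two versions are semantically identical for int. Whether the generated assembly is also identical is a property of the compiler and the optimization level, so it has to be checked on the actual build.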

Tags: c++, std, c++11, c++20, accumulate

Sam
