This is a quote from cppreference.com for std::copy(https://en.cppreference.com/w/cpp/algorithm/copy).
Complexity
1-2) Exactly (last - first) assignments
3-4) Exactly (last - first) applications of the predicate, between 0 and (last - first) assignments (assignment for every element for which predicate is equal to true, dependent on predicate and input data)
For the overloads with an ExecutionPolicy, there may be a performance cost if ForwardIt1's value type is not MoveConstructible.
It is clear that parallel version of std::copy should do additional work to organize parallelism and its complexity may be increased. I want to understand how much higher it could be and when it could happen.
If value type is not MoveConstructible it also means that it is not CopyConstuctible, right? Then how we can copy that kind of objects? Can somebody provider an example where we get performance penalty because of it.
If value type is not MoveConstructible it also means that it is not CopyConstuctible, right?
No, here is a counter-example, which if copying is expensive could incur some performance penalty due to the inability to move:
struct S
{
    S();
    S(const S&);
    S& operator=(const S&);
    S(S&&) = delete;
    S& operator=(S&&) = delete;
};
Notice that OutputIt is a unidirectional iterator, not random access. For parallel copying to achieve a performance benefit, the copying must occur in a non-sequential way, but the results must be stored into OutputIt sequentially. That's why the objects should be movable, to take them whence multiple threads create them and write them sequentially to OutputIt regardless of the ordering of assignment.
In truth it is not enough for the objects to be movable, they must be cheaply movable to avoid extra runtime cost.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With