I want to use the C++17 parallel algorithms to divide every element of a std::vector
by some constant and store the result in another std::vector
of the same length and, importantly, in the same order.
For example:
{6,9,12} / 3 = {2,3,4}
Here is my attempt, which does not compile:
#include <execution>
#include <algorithm>
template <typename T>
std::vector<T> & divide(std::vector<T> const & in)
{
std::vector<T> out(in.size(), 0);
float const divisor = 3;
std::for_each
( std::execution::par_unseq
, in.begin()
, in.end()
, /* divide each element by divisor and put result in out */ );
return out;
}
How can I get this running in a lock-free and thread-safe way?
Something like this:
#include <vector>
#include <algorithm>
#include <execution>
template <typename T>
std::vector<T> divide(std::vector<T> result)
{
// ^^ take the argument by value: an lvalue argument is copied once,
// an rvalue argument is moved, so no extra copy is made
float const divisor = 3;
// The following loop mutates distinct objects within the vector and
// invalidates no iterators. C++ guarantees that the elements are distinct
// objects (std::vector<bool> is the exception) and that neighbouring
// elements may be updated by different threads at the same time
// without a mutex.
std::for_each(
std::execution::par,
std::begin(result),
std::end(result),
[divisor](T& val) { // capture divisor by copy: safer, and just as fast
// modifies the element in place through the reference
val /= divisor;
});
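// The algorithm returns only after every element has been processed,
// so it is safe to use the vector as a whole from here on.
// Return by value. A by-value parameter is implicitly moved on return,
// so no element-wise copy is made.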
return result;
}
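The version above divides in place on a by-value copy. If the result should be written directly into a second vector, as the question describes, std::transform with an execution policy is one way to sketch it. The helper name divide_into and its signature are made up for illustration, and the sketch assumes T is default-constructible and is not bool, since std::vector<bool> packs its elements and they cannot safely be written concurrently.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical helper (name and signature are illustrative only):
// returns a new vector holding every element of `in` divided by `divisor`.
template <typename T>
std::vector<T> divide_into(std::vector<T> const& in, T divisor)
{
    // Pre-size the output so the algorithm only assigns to existing elements
    // and never reallocates.
    std::vector<T> out(in.size());

    // Each invocation reads one element of `in` and its result is stored in
    // the matching position of `out`. No two invocations touch the same
    // object, so no locking is needed, and the position-to-position mapping
    // preserves the input order regardless of how the work is scheduled.
    std::transform(std::execution::par_unseq,
                   in.begin(), in.end(),
                   out.begin(),
                   [divisor](T const& value) { return value / divisor; });

    // Returned by value: the local is moved or its copy is elided.
    return out;
}
```

Calling divide_into(std::vector<int>{6, 9, 12}, 3) should give {2, 3, 4} and leaves the input untouched. Note that with GCC's libstdc++ the parallel policies are typically backed by Intel TBB, so the program may need to be linked with -ltbb.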