Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory? In case it is, is it just as easy as assigning the value to a cudaMalloc'ed area, or should I use a thrust::device_ptr?
Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory?
The short answer is no.
thrust::reduce returns a value, the result of the reduction, and that value must be stored in a host-resident variable.
The call is synchronous and always returns its result to the CPU, as its signature shows:
template<typename Iterator, typename T>
T reduce(Iterator first, Iterator last, T init);
Once the result of the operation has been returned to the CPU, you can copy it to the GPU if you like:
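#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

int main(){
    thrust::device_vector<int> data(256, 1);   // 256 ones in device memory
    thrust::device_vector<int> result(1);      // one-element device buffer for the sum

    // reduce returns the sum to the host; assigning to result[0] copies it back to the device
    result[0] = thrust::reduce(data.begin(), data.end());

    // reading result[0] copies the value to the host again for printing
    std::cout << "result = " << result[0] << std::endl;
    return 0;
}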
Another alternative is to use thrust::reduce_by_key, which writes the reduction result to device memory rather than copying it to host memory. If you use a single key for your entire array, the net result is a single output, similar to thrust::reduce.