It seems when creating a new Thrust vector all elements are 0 by default - I just want to confirm that this will always be the case.
If so, is there also a way to bypass the constructor responsible for this behavior for additional speed (since for some vectors I don't need them to have an initial value, e.g. if their raw pointers are being passed to CUBLAS as an output)?
thrust::device_vector constructs the elements it contains using its supplied allocator, just like std::vector. It's possible to control what the allocator does when the vector asks it to construct an element.
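To make the std::vector analogy concrete, here is a minimal host-side sketch in standard C++ (no Thrust involved). The names noinit_allocator, noop_construct_calls and count_noop_constructs are hypothetical and exist only for this illustration. The counter is there just to show that the vector really does route element construction through the allocator.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical counter, for illustration only: records how many times
// the vector asked the allocator to construct an element with no arguments.
std::size_t noop_construct_calls = 0;

// Hypothetical minimal allocator whose no-argument construct() is a no-op.
template <typename T>
struct noinit_allocator {
    using value_type = T;

    noinit_allocator() = default;
    template <typename U>
    noinit_allocator(const noinit_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    // Chosen when the vector constructs an element without arguments,
    // e.g. std::vector<T, A> v(n). The storage is left untouched.
    template <typename U>
    void construct(U*) noexcept { ++noop_construct_calls; }

    // Every other construction (copies, moves, emplace) behaves normally.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <typename T, typename U>
bool operator==(const noinit_allocator<T>&, const noinit_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const noinit_allocator<T>&, const noinit_allocator<U>&) { return false; }

// Builds an n-element vector and reports how many no-op constructs ran.
std::size_t count_noop_constructs(std::size_t n) {
    noop_construct_calls = 0;
    std::vector<float, noinit_allocator<float>> v(n);
    return noop_construct_calls;
}
```

Calling count_noop_constructs(n) should return n, since the vector requests one construction per element and each request lands in the no-op overload. The elements of such a vector hold indeterminate values, so they must be written before they are read.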
Use a custom allocator to avoid default-initialization of vector elements:
#include <thrust/device_malloc_allocator.h>
#include <thrust/device_vector.h>

// uninitialized_allocator is an allocator which
// derives from device_malloc_allocator and which has a
// no-op construct member function
template<typename T>
  struct uninitialized_allocator
    : thrust::device_malloc_allocator<T>
{
  // note that construct is annotated as
  // a __host__ __device__ function
  __host__ __device__
  void construct(T *p)
  {
    // no-op
  }
};

// to make a device_vector which does not initialize its elements,
// use uninitialized_allocator as the 2nd template parameter
typedef thrust::device_vector<float, uninitialized_allocator<float> > uninitialized_vector;
You will still incur the cost of a kernel launch to invoke uninitialized_allocator::construct, but that kernel will be a no-op which will retire quickly. What you're really interested in is avoiding the memory bandwidth required to fill the array, which this solution does.
There's a complete example code here.
Note that this technique requires Thrust 1.7 or later.