I have a question about allocating pinned memory.
I am using CUDA to process a large amount of data.
To reduce run time, I found that I need to overlap memory copies with kernel launches.
After reading some texts and web pages, I learned that overlapping memory copies with kernel launches requires allocating the host memory with cudaMallocHost, which allocates pinned (page-locked) host memory.
When the host data is an integer or a plain array, making pinned memory is easy.
For example:
cudaStream_t* streams = (cudaStream_t*)malloc(MAX_num_stream * sizeof(cudaStream_t));
for(i=0; i<MAX_num_stream; i++)
    cudaStreamCreate(&(streams[i]));
cudaMallocHost(&departure, its_size);
for(n=1; ... ; n++){
    cudaMemcpyAsync( ... streams[n]);
    kernel <<< ... , ... , ... , streams[n] >>> (...);
}
But in my case, the host source buffer (departure) is a vector.
I can't find any way to make a vector's host memory pinned by using cudaMallocHost.
Please help or give me some advice on solving this problem. Thank you for reading despite my poor English.
You can't directly allocate memory for anything other than POD types using cudaMallocHost.
If you really need a std::vector which uses pinned memory, you will have to implement your own model of std::allocator which calls cudaMallocHost internally, and instantiate your std::vector using that custom allocator.
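A minimal sketch of such an allocator might look like the following. The names PinnedAllocator, PinnedVector and make_departure are hypothetical and chosen only for illustration. As an assumption of this sketch, the CUDA calls are used only when the file is compiled with nvcc; with a plain C++ compiler it falls back to std::malloc so it can be tried without the toolkit, in which case the memory is of course not pinned.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

#if defined(__CUDACC__)
#include <cuda_runtime.h>   // cudaMallocHost, cudaFreeHost
#endif

// Hypothetical allocator: hands std::vector page-locked host memory.
template <typename T>
struct PinnedAllocator {
    using value_type = T;

    PinnedAllocator() noexcept = default;

    // Converting constructor required so the allocator can be rebound.
    template <typename U>
    PinnedAllocator(const PinnedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n == 0) return nullptr;
        void* p = nullptr;
#if defined(__CUDACC__)
        // Pinned allocation through the CUDA runtime.
        if (cudaMallocHost(&p, n * sizeof(T)) != cudaSuccess) throw std::bad_alloc();
#else
        // Fallback (assumption of this sketch): ordinary pageable memory.
        p = std::malloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
#endif
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
#if defined(__CUDACC__)
        cudaFreeHost(p);
#else
        std::free(p);
#endif
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const PinnedAllocator<T>&, const PinnedAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const PinnedAllocator<T>&, const PinnedAllocator<U>&) noexcept { return false; }

// Convenience alias: a std::vector whose storage comes from PinnedAllocator.
template <typename T>
using PinnedVector = std::vector<T, PinnedAllocator<T>>;

// Hypothetical helper: builds a host buffer holding 0, 1, ..., n-1.
inline PinnedVector<int> make_departure(std::size_t n) {
    PinnedVector<int> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i);
    return v;
}
```

With something like this in place, departure.data() and departure.size() * sizeof(int) could be passed to cudaMemcpyAsync in the loop from the question. Keep in mind that growing the vector reallocates, so each reallocation goes through cudaMallocHost again, and any pointer previously handed to an in-flight copy becomes invalid.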
Alternatively, the Thrust template library (which ships in recent releases of the CUDA toolkit) includes an experimental pinned memory allocator which you could use with Thrust's own vector class, which is itself a model of std::vector.