If not, what is the standard way to free up cudaMalloc'ed memory when an exception is thrown? (Note that I am unable to use Thrust.)
Pointer arithmetic does work just fine in CUDA. You can add an offset to a CUDA pointer in host code and it will work correctly (remembering the offset isn't a byte offset, it is a plain word or element offset).
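For example (a minimal sketch; the array size and variable names are just illustrative), offsetting a device pointer in host code works like ordinary C++ pointer arithmetic, with the offset counted in elements:

#include <cuda_runtime.h>

int main() {
    float* d_data = nullptr;
    cudaMalloc((void**)&d_data, 100 * sizeof(float));   // 100 floats on the device

    float h_value = 42.0f;
    // d_data + 10 points to the 11th float (element offset, not byte offset).
    cudaMemcpy(d_data + 10, &h_value, sizeof(float), cudaMemcpyHostToDevice);

    cudaFree(d_data);
    return 0;
}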
A smart pointer is an object that acts as a wrapper around a raw pointer in C++. Like a raw pointer, a smart pointer points to an object, but it provides additional functionality. To use smart pointers, the memory header file must be included, as it contains the necessary classes and member functions. A unique pointer is the kind of smart pointer that exclusively owns the object it points to, i.e. two unique pointers cannot point to the same object.
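For reference, a minimal host-side sketch of those two points: std::unique_ptr comes from the <memory> header, and because ownership is exclusive it can only be moved, never copied.

#include <memory>
#include <utility>

int main() {
    std::unique_ptr<int> a = std::make_unique<int>(5);
    // std::unique_ptr<int> b = a;         // does not compile: copying is forbidden
    std::unique_ptr<int> b = std::move(a); // ownership transfers from a to b; a becomes empty
    return 0;                              // b's destructor deletes the int automatically
}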
You can use the RAII idiom and put your cudaMalloc() and cudaFree() calls in the constructor and destructor of your object, respectively. Once an exception is thrown, your destructor will be called, which will free the allocated memory. If you wrap this object in a smart pointer (or make it behave like a pointer), you will get your CUDA smart pointer.
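A minimal sketch of that idea (the DeviceBuffer class name and its interface are illustrative, not part of the CUDA API): the constructor allocates device memory and the destructor frees it, so stack unwinding after an exception releases the allocation automatically.

#include <cuda_runtime.h>
#include <cstddef>
#include <stdexcept>

template <typename T>
class DeviceBuffer {                        // illustrative name, not a CUDA API type
public:
    explicit DeviceBuffer(std::size_t n) : ptr_(nullptr), size_(n) {
        if (cudaMalloc((void**)&ptr_, n * sizeof(T)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
    }
    ~DeviceBuffer() { cudaFree(ptr_); }     // also runs during stack unwinding

    DeviceBuffer(const DeviceBuffer&) = delete;            // no copies:
    DeviceBuffer& operator=(const DeviceBuffer&) = delete; // single owner

    T* get() const { return ptr_; }
    std::size_t size() const { return size_; }

private:
    T* ptr_;
    std::size_t size_;
};

void work() {
    DeviceBuffer<float> buf(1 << 20);       // allocation happens here
    // ... code that may throw ...
    throw std::runtime_error("something went wrong");
}                                            // ~DeviceBuffer calls cudaFree despite the exception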
You can use this custom cuda::shared_ptr implementation. As mentioned above, this implementation uses std::shared_ptr as a wrapper for CUDA device memory.
std::shared_ptr<T[]> data_host = std::shared_ptr<T[]>(new T[n]);
...
// In host code:
fun::cuda::shared_ptr<T> data_dev;
data_dev->upload(data_host.get(), n);
// In .cu file:
// data_dev.data() points to device memory which contains data_host;
This repository is indeed a single header file (cudasharedptr.h), so it will be easy to adapt it if necessary for your application.
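If you only need the ownership-and-cleanup part, a similar effect can be approximated with a plain std::shared_ptr and a custom deleter. This is only a rough sketch of the general idea (make_device_array is a hypothetical helper), not the actual cudasharedptr.h implementation:

#include <cuda_runtime.h>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Allocate n elements on the device and hand ownership to a std::shared_ptr
// whose deleter calls cudaFree when the last reference goes away.
template <typename T>
std::shared_ptr<T> make_device_array(std::size_t n) {
    T* raw = nullptr;
    if (cudaMalloc((void**)&raw, n * sizeof(T)) != cudaSuccess)
        throw std::runtime_error("cudaMalloc failed");
    return std::shared_ptr<T>(raw, [](T* p) { cudaFree(p); });
}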