I have different memory allocators in my code: one for CUDA (managed or not) and one for pure host memory. I could also imagine wanting different allocation algorithms, for example one for large, long-lived blocks and another for small, short-lived objects.
I wonder how to implement such a system properly.
Placement new?
My current solution uses placement new, where the pointer decides which memory and memory allocator to use. Care must then be taken when deleting/de-allocating the objects. Currently, it works, but I think it's not a nice solution.
MyObj* cudaObj = new (allocateCudaMemoryField(sizeof(MyObj))) MyObj(arg1, arg2);
MyObj* hostObj = new (allocateHostMemoryField(sizeof(MyObj))) MyObj(arg1, arg2);
Overload new, but how?
I'd like to go for a solution with an overloaded new operator. Something that will look as follows:
MyObj* cudaObj = CudaAllocator::new MyObj(arg1, arg2);
MyObj* hostObj = HostAllocator::new MyObj(arg1, arg2);
CudaAllocator::delete cudaObj;
HostAllocator::delete hostObj;
I think I could achieve this by having a namespace CudaAllocator and HostAllocator, each with an overloaded new and delete.
Two questions:
Is it reasonable to have different overloads of new in a code base, or is this a sign of a design flaw?
There is a time and place for overloading operator new/delete, but it is generally best reserved for cases where simpler measures have been exhausted.
The main disadvantage of placement new is that it requires the caller to "remember" how the object was allocated and to invoke the corresponding deallocation when the object reaches the end of its lifetime. Additionally, requiring the caller to invoke placement new is syntactically burdensome (I presume this is the "not a nice solution" you mention).
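To make the "remember" burden concrete, here is a minimal sketch of what the caller has to do by hand. The allocator pair is simulated on the ordinary heap, and Widget and g_liveWidgets are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-ins for the question's CUDA allocator pair (simulated on the heap).
inline void* allocateCudaMemoryField(std::size_t size) { return ::operator new(size); }
inline void deallocateCudaMemoryField(void* ptr, std::size_t) { ::operator delete(ptr); }

int g_liveWidgets = 0;  // counts constructed-but-not-yet-destroyed Widgets

struct Widget {  // hypothetical payload type
    explicit Widget(int v) : value(v) { ++g_liveWidgets; }
    ~Widget() { --g_liveWidgets; }
    int value;
};

// Allocates, constructs, then tears down by hand; returns the live Widget count.
int placementRoundTrip() {
    void* raw = allocateCudaMemoryField(sizeof(Widget));
    Widget* w = new (raw) Widget(42);  // construct in caller-supplied storage

    // `delete w;` would be wrong here: the storage did not come from a new-expression.
    w->~Widget();                                  // 1. run the destructor explicitly
    deallocateCudaMemoryField(w, sizeof(Widget));  // 2. hand storage back to the matching allocator
    return g_liveWidgets;
}
```

Both teardown steps have to be repeated at every call site, and nothing stops a caller from pairing the pointer with the wrong deallocation function.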
The main disadvantage of overloading new/delete is that it is meant to be done once for a given type (as @JSF pointed out). This tightly couples an object to the way it is allocated and deallocated.
Presuming this setup:
#include <cstddef>
#include <iostream>
#include <memory>

// Stand-ins for the real allocators: each one logs and uses the ordinary heap.
void* allocateCudaMemoryField(std::size_t size)
{
    std::cout << "allocateCudaMemoryField" << std::endl;
    return new char[size]; // simulated
}
void* allocateHostMemoryField(std::size_t size)
{
    std::cout << "allocateHostMemoryField" << std::endl;
    return new char[size];
}
void deallocateCudaMemoryField(void* ptr, std::size_t)
{
    std::cout << "deallocateCudaMemoryField" << std::endl;
    delete[] static_cast<char*>(ptr); // simulated; must match new char[]
}
void deallocateHostMemoryField(void* ptr, std::size_t)
{
    std::cout << "deallocateHostMemoryField" << std::endl;
    delete[] static_cast<char*>(ptr); // deleting through void* is undefined behavior
}
Here's MyObj with overloaded new/delete (your question):
struct MyObj
{
    MyObj(int /*arg1*/, int /*arg2*/)
    {
        std::cout << "MyObj()" << std::endl;
    }
    ~MyObj()
    {
        std::cout << "~MyObj()" << std::endl;
    }
    // Class-specific allocation: every `new MyObj` goes through here.
    static void* operator new(std::size_t size)
    {
        std::cout << "MyObj::new" << std::endl;
        return ::operator new(size);
    }
    static void operator delete(void* ptr)
    {
        std::cout << "MyObj::delete" << std::endl;
        ::operator delete(ptr);
    }
};
MyObj* const ptr = new MyObj(1, 2);
delete ptr;
Prints the following:
MyObj::new
MyObj()
~MyObj()
MyObj::delete
A better solution might be to use RAII pointer types combined with a factory to hide the details of allocation and deallocation from the caller. This solution uses placement new, but handles deallocation by attaching a deleter callback to a unique_ptr.
class MyObjFactory
{
public:
    static auto MakeCudaObj(int arg1, int arg2)
    {
        constexpr std::size_t size = sizeof(MyObj);
        MyObj* const ptr = new (allocateCudaMemoryField(size)) MyObj(arg1, arg2);
        // The deleter travels with the pointer, so the caller cannot mismatch it.
        return std::unique_ptr<MyObj, decltype(&deallocateCudaObj)>(ptr, deallocateCudaObj);
    }
    static auto MakeHostObj(int arg1, int arg2)
    {
        constexpr std::size_t size = sizeof(MyObj);
        MyObj* const ptr = new (allocateHostMemoryField(size)) MyObj(arg1, arg2);
        return std::unique_ptr<MyObj, decltype(&deallocateHostObj)>(ptr, deallocateHostObj);
    }
private:
    // Each deleter destroys the object, then returns the storage to its allocator.
    static void deallocateCudaObj(MyObj* ptr) noexcept
    {
        ptr->~MyObj();
        deallocateCudaMemoryField(ptr, sizeof(MyObj));
    }
    static void deallocateHostObj(MyObj* ptr) noexcept
    {
        ptr->~MyObj();
        deallocateHostMemoryField(ptr, sizeof(MyObj));
    }
};
{
    auto objCuda = MyObjFactory::MakeCudaObj(1, 2);
    auto objHost = MyObjFactory::MakeHostObj(1, 2);
}
Prints:
allocateCudaMemoryField
MyObj()
allocateHostMemoryField
MyObj()
~MyObj()
deallocateHostMemoryField
~MyObj()
deallocateCudaMemoryField
This gets better. With this same strategy, we can handle the allocation/deallocation semantics for any class.
#include <utility> // std::forward

class Factory
{
public:
    // Generic versions that don't care what kind of object is being allocated
    template <class T, class... Args>
    static auto MakeCuda(Args&&... args)
    {
        // Forward the arguments so they are not copied on the way to T's constructor
        T* const ptr = new (allocateCudaMemoryField(sizeof(T))) T(std::forward<Args>(args)...);
        using Deleter = void(*)(T*);
        using Ptr = std::unique_ptr<T, Deleter>;
        return Ptr(ptr, deallocateCuda<T>);
    }
    template <class T, class... Args>
    static auto MakeHost(Args&&... args)
    {
        T* const ptr = new (allocateHostMemoryField(sizeof(T))) T(std::forward<Args>(args)...);
        using Deleter = void(*)(T*);
        using Ptr = std::unique_ptr<T, Deleter>;
        return Ptr(ptr, deallocateHost<T>);
    }
private:
    template <class T>
    static void deallocateCuda(T* ptr) noexcept
    {
        ptr->~T();
        deallocateCudaMemoryField(ptr, sizeof(T));
    }
    template <class T>
    static void deallocateHost(T* ptr) noexcept
    {
        ptr->~T();
        deallocateHostMemoryField(ptr, sizeof(T));
    }
};
Used with a new class S:
struct S
{
    S(int x, int y, int z) : x(x), y(y), z(z)
    {
        std::cout << "S()" << std::endl;
    }
    ~S()
    {
        std::cout << "~S()" << std::endl;
    }
    int x, y, z;
};
{
    auto objCuda = Factory::MakeCuda <S>(1, 2, 3);
    auto objHost = Factory::MakeHost <S>(1, 2, 3);
}
Prints:
allocateCudaMemoryField
S()
allocateHostMemoryField
S()
~S()
deallocateHostMemoryField
~S()
deallocateCudaMemoryField
I didn't want to crank the templating full blast, but obviously that code is ripe for DRYing out (parameterize the implementations on the allocator function).
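One way that DRYing out might look, as a rough sketch rather than a finished design: move each allocate/deallocate pair into a small policy type and parameterize a single factory on it. CudaMemory, HostMemory, PolicyDeleter, PolicyPtr, Make and Point are all hypothetical names, and the policy bodies are simulated with the ordinary heap.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical policy types; the bodies stand in for the real allocators.
struct CudaMemory {
    static void* allocate(std::size_t n) { return ::operator new(n); }  // simulated
    static void deallocate(void* p, std::size_t) noexcept { ::operator delete(p); }
};
struct HostMemory {
    static void* allocate(std::size_t n) { return ::operator new(n); }
    static void deallocate(void* p, std::size_t) noexcept { ::operator delete(p); }
};

// Stateless deleter: destroys the object, then returns the storage to Memory.
template <class Memory, class T>
struct PolicyDeleter {
    void operator()(T* ptr) const noexcept {
        ptr->~T();
        Memory::deallocate(ptr, sizeof(T));
    }
};

template <class Memory, class T>
using PolicyPtr = std::unique_ptr<T, PolicyDeleter<Memory, T>>;

// One factory for every memory kind, e.g. Make<CudaMemory, Point>(1, 2).
template <class Memory, class T, class... Args>
PolicyPtr<Memory, T> Make(Args&&... args) {
    void* raw = Memory::allocate(sizeof(T));
    try {
        return PolicyPtr<Memory, T>(new (raw) T(std::forward<Args>(args)...));
    } catch (...) {
        Memory::deallocate(raw, sizeof(T));  // constructor threw: do not leak the storage
        throw;
    }
}

// Hypothetical payload type for trying the factory out.
struct Point {
    Point(int x, int y) : x(x), y(y) {}
    int x, y;
};
```

With this shape, the memory kind becomes part of the pointer's type, so a CUDA-backed pointer cannot be assigned to a host-backed one by accident. Because the deleter is an empty class rather than a function pointer, the resulting unique_ptr is typically no larger than a raw pointer.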
This works out pretty well when your objects are relatively large and not allocated/deallocated too frequently. I wouldn't use this if you have millions of objects coming and going every second.
Some of the same strategies work, but you also want to consider tactics like allocator objects for containers such as vector.
It really depends on your needs.
No. Don't overload new/delete in this situation. Build an allocator that delegates to your generic memory allocators.
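To make that last point concrete, here is a minimal sketch of a standard-library-style allocator that delegates to the host allocation functions, so that containers such as vector draw their storage from them. HostAllocator and makeHostVector are hypothetical names, and the two free functions are simulated with the ordinary heap; a CUDA variant would presumably look the same with the other function pair.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-ins for the generic host allocator pair (simulated on the heap).
inline void* allocateHostMemoryField(std::size_t size) { return ::operator new(size); }
inline void deallocateHostMemoryField(void* ptr, std::size_t) { ::operator delete(ptr); }

// Minimal C++11-style allocator that forwards to the functions above.
template <class T>
struct HostAllocator {
    using value_type = T;

    HostAllocator() = default;
    template <class U>
    HostAllocator(const HostAllocator<U>&) noexcept {}  // lets containers rebind to other types

    T* allocate(std::size_t n) {
        return static_cast<T*>(allocateHostMemoryField(n * sizeof(T)));
    }
    void deallocate(T* ptr, std::size_t n) noexcept {
        deallocateHostMemoryField(ptr, n * sizeof(T));
    }
};

// Stateless allocators of the same family are interchangeable.
template <class T, class U>
bool operator==(const HostAllocator<T>&, const HostAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const HostAllocator<T>&, const HostAllocator<U>&) noexcept { return false; }

// Builds a vector whose storage comes from allocateHostMemoryField.
inline std::vector<int, HostAllocator<int>> makeHostVector() {
    std::vector<int, HostAllocator<int>> v;
    v.reserve(1024);  // one bulk allocation up front
    for (int i = 0; i < 1024; ++i) v.push_back(i);
    return v;
}
```

The container takes care of construction, destruction and deallocation, so neither placement new nor a custom delete shows up at the call site.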