Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Type-erased allocators in modern C++

The "classic" STL containers such as std::vector and std::map take their allocator types as a template argument. This means that std::vector<T, std::allocator<T>> and std::vector<T, MyAllocator> for example are considered completely separate types.

Some newer allocator-aware classes like std::shared_ptr and std::tuple on the other hand use type-erasure to "hide" information about the allocator, so it does not form part of the type signature. However, std::unordered_map (which is of a similar vintage to shared_ptr) maintains the classic approach of taking an extra defaulted template parameter.

Questions:

  1. Is treating std::vector<T, std::allocator<T>> and std::vector<T, MyAllocator> as distinct types considered desirable, or is it just a side effect of type-erasure not being a well-known technique at the time the STL was written?

  2. What are the downsides (if any) of using type-erasure in this way?

  3. Should type-erased allocators always be preferred for new containers?

like image 671
Tristan Brindle Avatar asked Oct 23 '14 03:10

Tristan Brindle


1 Answers

Some newer allocator-aware classes like std::shared_ptr and std::tuple on the other hand use type-erasure to "hide" information about the allocator, so it does not form part of the type signature.

std::tuple doesn't use type-erasure at all. A tuple can be constructed with an allocator, but it just (conditionally) passes it to its elements, it doesn't store it anywhere, because a tuple never allocates any memory so has no need for an allocator.

std::shared_ptr does allocate memory, so it can use an allocator, which it will store until the control block needs to be deallocated. Since the control block is already invisible to users and stored on the heap, the allocator associated with that control block is also invisible to users.

So the comparison to shared_ptr is not very relevant, because it has completely different uses for an allocator that don't apply to containers.

  1. Is treating std::vector<T, std::allocator<T>> and std::vector<T, MyAllocator> as distinct types considered desirable, or is it just a side effect of type-erasure not being a well-known technique at the time the STL was written?

The original motivation for allocators in the STL was to encapsulate details about the memory model, specifically "near" and "far" pointers of segmented memory. This is why the allocator defines a pointer member which the container uses internally. A vector using near pointers must not mix up addresses of its elements with those in another container using far pointers, for example.

So for the original use, having distinct types was valuable, but that original use is irrelevant these days.

  1. What are the downsides (if any) of using type-erasure in this way?
  • All function calls have to be virtual (or some other form of indirect call e.g. through a function pointer) and are much harder to inline. This isn't a problem for shared_ptr which just allocates some memory once before erasing the allocator type then uses it once more to free the memory, but general-purpose containers might make thousands of allocations.

  • A type-erased allocator is much harder to retrieve from the container, making it complicated to create a copy of the container. (Should it use a copy of the allocator? How do you copy something you can't see?) This isn't a problem for types like shared_ptr because copying a shared_ptr just increases the reference-count, it doesn't allocate anything.

  • The object generally needs to be larger by sizeof(void*) to store the type-erased allocator. That extra pointer can't be optimized away, even if the allocator is an empty, stateless type such as std::allocator<T>. Depending on the type that could mean a 50% or even 100% increase in size compared to being able to exploit the Empty Base-class Optimization to store an empty allocator. This isn't a problem for shared_ptr because the allocator isn't needed except when creating or destroying the control block, so it doesn't need to be accessible for the shared_ptr to use for other (de)allocations.

  • Because a type-erased allocator has to meet an specific abstract interface it has to use raw pointers in its allocate and deallocate members. This mean you can't use a custom pointer type, e.g. a pointer that stores a relative offset to a base address, which is useful for shared-memory allocators as used in Boost.Interprocess.

  1. Should type-erased allocators always be preferred for new containers?

I would say no. If the allocator is part of the type you can optimize it away for the common cases, while still allowing users of the container to choose a polymorphic allocator that uses type erasure internally, such as the ones in the Library Fundamentals TS

like image 73
Jonathan Wakely Avatar answered Oct 02 '22 14:10

Jonathan Wakely