Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast move assignment with Howard Hinnant's short_alloc

I am using Howard Hinnant's nice little arena-based allocator, short_alloc.

It struck me that move-assigning from a vector, which has outgrown its arena and is thus allocated on heap, could be done using the usual, fast move-assignment (i.e., grabbing the target's resources). However, this is not the case:

typedef arena<16>                     arena_type;
typedef short_alloc<int, 16>          alloc_type;
typedef std::vector<int, alloc_type>  vec_type;

arena_type arena1, arena2;
vec_type vec1(alloc_type(arena1)), vec2(alloc_type(arena2));
vec1.resize(100);

void* data = vec1.data();
vec2 = std::move(vec1);
assert(vec2.data() == data);  // fails

As explained in this answer, this is due to the vector's move assignment operator comparing the two allocators (note that propagate_on_container_move_assignment is std::false_type). Since the two allocators don't compare equal (because they have different arenas), the target vector needs to allocate memory and move the values one by one.

The desired behaviour is achieved by changing the equality operator to

template <class T1, size_t N1, class T2, size_t N2>
bool operator==(const short_alloc<T1, N1>& x, const short_alloc<T2, N2>& y) noexcept
{
    return N1 == N2 && (&x.a_ == &y.a_ || y.a_.on_heap());
}

where on_heap() checks if the allocator is not using its arena.

This solution looks rather hackish (note for example that equality is not symmetric), can/will I shoot myself in the foot by doing this? Is there a elegant solution?

like image 365
marton78 Avatar asked Oct 07 '13 16:10

marton78


1 Answers

Two different arena objects may have different lifetime. Two different short_alloc objects that depend on different arena objects manage memory with different lifetime. As a result, two std::vector objects with different short_alloc objects can't simply move pointers between them.

Your hack won't work, since it is the pointer that is allocated either from the arena or by new[]. Your hack assumes that the allocator becomes a heap allocator for big vectors, but this is not the case. This is not something that an allocator knows without inspecting the requested size or the freed pointer.

The correct solution is to replace the allocator object with the move operator. To do that, short_alloc should define:

   using propagate_on_container_move_assignment = std::true_type;
private:
   arena_type * a_;  // <--- instead of reference
public:
   short_alloc(const short_alloc&) = default;
   // !!! Don't delete the move assignment !!!
   // short_alloc& operator=(const short_alloc&) = delete;

This will make the move operator work as expected. It will start using the other arena after the move.


In general, such allocation techniques are very dangerous and should be rarely used, due to their high risk of memory-related bugs. For example, if one is to return a vector from a function, there is a high risk for it to refer to an arena on a scope that is about to be exited.

With my proposed change, the risk factor is slightly higher. The problem of the arena going out-of-scope now also goes to pass-by-reference vectors. The problem of the arena going out-of-scope also exists when the arena is defined in an inner block.

This behavior of the other arena going out of scope can surprise a programmer, and introduce bugs. That's why I am not a fan of this solution. However, sometimes one is willing to write dangerous code in time-critical sections (after profiling and analyzing).


As the question suggests, it is possible to mark a short_alloc allocator as a heap allocator when applicable. It can be marked this way right after the first allocation that uses new[]. This will work well with std::vector, since it holds only one chunk of memory between method calls. Despite working with std::vector, it breaks with most other containers, because most of them use nodes, such as std::map and std::unordered_set.

The problem is that some of the nodes come from the arena and some from the heap. With the suggested operator== that returns true if new[] is used, a move from std::map will make some nodes from an unrelated arena move the the target std::map. Very unexpected, and a bad idea. This will result in a std::map object that contains nodes from its own arena and from an unrelated arena. The nodes of the unrelated arena will never be freed by std::map. These bad nodes will be freed only when their allocating arena dies.

The technique proposed in the question is completely broken. It results with inconsistent allocations in surprising ways for almost anything but std::vector. I will strongly advise against it.

like image 90
Michael Veksler Avatar answered Oct 22 '22 12:10

Michael Veksler