For the third time in a few years I find myself needing an intrusive linked list for a project that does not allow boost (ask management...).
For the third time I find that the intrusive linked list implementation I have works perfectly, but that I really don't like that it makes use of undefined behavior - namely when converting the pointer to a list node into a pointer to the object containing that list node.
That awful code currently looks like this:
struct IntrusiveListNode {
IntrusiveListNode * next_;
IntrusiveListNode * prev_;
};
template <typename T, IntrusiveListNode T::*member>
class IntrusiveList {
// snip ...
private:
T & nodeToItem_(IntrusiveListNode & node) {
return *(T*)(((char*)&node)-((size_t)&(((T*)nullptr)->*member)));
}
IntrusiveListNode root_;
};
I don't really care how ugly nodeToItem_
gets, but I would like to keep the public interface and syntax of IntrusiveList
the same. Specifically, I would like to specify the type of a list type using IntrusiveList<Test, &Test::node_>
rather than IntrusiveList<Test, offsetof(Test, node_)>
.
It's almost 2016 - is there any way to do this without invoking undefined behavior?
Edit: There have been a few suggested solutions (involving different structures of the list) in the comments which I want to summarize here:
Live with undefined behavior, since the language has seemingly arbitrary limitations that prevent using member pointers in reverse.
Store an additional pointer to the containing class within IntrusiveListNode
. This is currently probably the cleanest solution (no change in interface necessary), but does require a third pointer in every list node (small optimizations may be possible).
Derive from IntrusiveListNode
and use static_cast
. In boost this is the base_hook
version of an intrusive linked list. I would like to stick with the member_hook
version to avoid introducing multiple inheritance.
Store pointers to the next and previous containing class instead of to the next and previous list node within IntrusiveListNode
. This makes creating the root node within the intrusive list difficult. Either the list must include a full instantiation of T
(which is not possible e.g. if T
is abstract), or the end of the list would need to be a null pointer (which would break --list.end()
, allowing forward iteration only).
Boost intrusive lists have a member_hook
version that works somehow, but the implementation has not been understood (and it possibly also relies on undefined behavior).
The question remains: is it possible to make an intrusive member-based list with bidirectional iteration support, no undefined behavior, and no "unnecessary" memory overhead?
Intrusive linked lists are a variation of linked lists where the links are embedded in the structure that’s being linked. In a typical linked list implementation, a list node contains a data pointer to the linked data and a next pointer to the next node in the list.
There are two main reasons to use intrusive lists over non-intrusive linked lists: Fewer memory allocations. Less cache thrashing. With non-intrusive linked lists, creating a new object and adding it to a list requires two memory allocations: one for the object, and one for the list node.
In a typical linked list implementation, a list node contains a data pointer to the linked data and a next pointer to the next node in the list. In an intrusive linked list implementation, the list node contains next pointer to the next list node, but no data pointer because the list is embedded in the linked object itself.
A doubly linked list is a linked list that keeps pointers to both the next node and the previous node. The list structure would contain an extra prev pointer: Doubly linked lists make deletion and insertion easier, because you only need a reference to a single node to perform the deletion or insertion.
I would side-step the problem and use a node<T>
containing suitable
members to link the range. To deal with a bidirectional, intrusive
list I'd use an asymmetric node<T>
like this:
template <typename T>
class intrusive::node
{
template <typename S, node<S> S::*> friend class intrusive::list;
template <typename S, node<S> S::*> friend class intrusive::iterator;
T* next;
node<T>* prev;
public:
node(): next(), prev() {}
node(node const&) {}
void operator=(node const&) {}
};
The basic idea is that the list<T, L>
contains a node<T>
using
the next
pointer to point to the first element. That's fairly
straight forward: given a pointer p
to a T
the link to the next
node can be traversed using (p->*L).next
. However, instead of
directly navigating the list using T*
, an iterator<T, L>
actually
uses a pointer to the node<T>
: while that isn't necessary for
forward traversal, it enables backward traversal and insertions
anywhere in the list without special treatment of the list head.
The copy constructor and copy assignment are defined to do nothing
to avoid half-inserted nodes when copying a node. Depending on the
needs of the nodes it may be more reasonable to rather = delete
these operations. However, that's unrelated to the question at hand.
The iterator just uses a pointer to the node<T>
whose next
member points at the current node. For the first element in the
list this is a pointer to the list<T, L>
's node<T>
member.
Assuming you got a pointer to a suitable node<T>
an iterator<T,
L>
can be created from that:
template <typename T, intrusive::node<T> T::*Link>
class intrusive::iterator
{
template <typename S, node<S> S::*> friend class intrusive::list;
node<T>* current;
public:
explicit iterator(node<T>* current): current(current) {}
T& operator*() { return *this->operator->(); }
T* operator->() { return this->current->next; }
bool operator== (iterator const& other) const {
return this->current == other.current;
}
bool operator!= (iterator const& other) const {
return !(*this == other);
}
iterator& operator++() {
this->current = &(this->current->next->*Link);
return *this;
}
iterator operator++(int) {
iterator rc(*this);
this->operator++();
return rc;
}
iterator& operator--() {
this->current = this->current->prev;
return *this;
}
iterator operator--(int) {
iterator rc(*this);
this->operator--();
return rc;
}
};
Dereferencing just uses the next
pointer. The same is true for
forward iteration which uses the next
pointer together with the
member pointer to get hold of the address of the next node<T>
.
Since the iterator's prev
already pointesr at a node<T>
backward
iteration just needs to replace the current node<T>
with the
prev
element.
Finally, this leaves a list maintaining the beginning and the end of the list. Dealing with the bidirectional access and the corresponding access to the last node adds a bit of complexity and the need to actually have a dedicated node. Here is an implementation (which isn't thoroughly tested: I may have messed up some of the links):
template <typename T, intrusive::node<T> T::*Link>
class intrusive::list
{
node<T> content;
public:
list() { this->content.prev = &this->content; }
iterator<T, Link> begin() { return iterator<T, Link>(&this->content); }
iterator<T, Link> end() { return iterator<T, Link>(this->content.prev); }
T& front() { return *this->content.next; }
T& back() { return *(this->content.prev->prev->next); }
bool empty() const { return &this->content == this->content.prev; }
void push_back(T& node) { this->insert(this->end(), node); }
void push_front(T& node) { this->insert(this->begin(), node); }
void insert(iterator<T, Link> pos, T& node) {
(node.*Link).next = pos.current->next;
((node.*Link).next
? (pos.current->next->*Link).prev
: this->content.prev) = &(node.*Link);
(node.*Link).prev = pos.current;
pos.current->next = &node;
}
iterator<T, Link> erase(iterator<T, Link> it) {
it.current->next = (it.current->next->*Link).next;
(it.current->next
? (it.current->next->*Link).prev
: this->content.prev) = it.current;
return iterator<T, Link>(&(it.current->next->*Link));
}
};
Just for a bit of sanity: here is a function to simply print the list:
template <typename T, intrusive::node<T> T::*Link>
std::ostream& intrusive::operator<< (std::ostream& out, intrusive::list<T, Link>& list)
{
out << "[";
if (!list.empty()) {
std::copy(list.begin(), --list.end(), std::ostream_iterator<T>(out, ", "));
out << list.back();
}
return out << "]";
}
There are few other approaches avoiding the need to do any funky access to the enclosing class. The above avoids a couple of conditions. Assuming I managed to set the appropriate links correct the code would not rely on any implementation defined or undefined behavior.
You'd use the list like this:
class Node {
public:
intrusive::node<Node> link0;
intrusive::node<Node> link1;
int n;
Node(int n): n(n) {}
};
std::ostream& operator<< (std::ostream& out, Node const& n) {
return out << n.n;
}
int main()
{
intrusive::list<Node, &Node::link0> l0;
intrusive::list<Node, &Node::link1> l1;
Node n[] = { 10, 11, 12, 13, 14, 15 };
l0.push_front(n[0]);
l0.push_front(n[1]);
l0.push_front(n[2]);
l1.push_back(n[0]);
l1.push_back(n[1]);
l1.push_back(n[2]);
std::cout << "l0=" << l0 << " l1=" << l1 << "\n";
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With