Simple lock free stack c++11

Question

I've seen several overly complicated (in my opinion obviously) implementation of a lock free stack in c++ (using tags like here) and I came up with what I believe is a simple and still valid implementation. Since I couldn't find this implementation anywhere (I've seen the Push function implemented similarly to what I did but not the Pop), I'm guessing that it's incorrect in some way (most likely fails the ABA case):

template<typename Data>
struct Element
{
  Data mData;
  Element<Data>* mNext;
};

template<typename Data>
class Stack
{
 public:
  using Obj = Element<Data>;
  std::atomic<Obj*> mHead;

  void Push(Obj *newObj)
  {
    newObj->mNext = mHead.load();
    //Should I be using std::memory_order_acq_rel below??
    while(!mHead.compare_exchange_weak(newObj->mNext, newObj));
  }

  Obj* Pop()
  {
    Obj* old_head = mHead.load();
    while (1)
    {
      if (old_head == nullptr)
        return nullptr;
      //Should I be using std::memory_order_acq_rel below??
      if(mHead.compare_exchange_weak(old_head, old_head->mNext)) ///<<< CL1
        return old_head;
    }
  }
};

I'm assuming that the callers of Push and Pop will take care of memory allocation and deallocation. Another option is to make the above Push and Pop private methods and have new public functions that would take care of memory allocation and call these functions internally. I believe the most tricky part of this implementation is the line I marked with "CL1". The reason I think it's correct and still works in the ABA case is the following:

Lets say the ABA case does happen. Meaning that mHead at "CL1" would be equal to old_head but the object they're both pointing to would actually be different than the one old_head was originally pointing to when I assigned mHead to it. But, I think that even if it's a different object we're still OK, since we know it's a valid "head". old_head is pointing to the same object as mHead is, so it IS the valid head of the stack and this means that old_head->mNext is the valid next head. So, updating mHead to old_head->mNext is still correct!

To summarize:

If mHead != old_head (another thread preempted us and changed mHead) -> old_head is updated to be the new mHead and we start the loop again.
[NON-ABA] If mHead == old_head -> simple case, update mHead to be old_head->next (==mHead->mNext) and return old_head.
[ABA] If mHead == old_head -> works as explained above.

So, is my implementation valid? What am I missing?

Casey · Accepted Answer

ABA happens when:

Thread A reads old_head->mNext and blocks before calling compare_exchange_weak.
Thread B pops the current node pushes some other nodes, then pushes the original node back onto the stack.
Thread A unblocks, completes compare_exchange_weak successfully since mHead has the same value, but stores the stale mNext value as the new mHead.

See this answer for more details, you have both problem #2 (data race on mNext) and problem #3 (ABA).

Yttrill · Answer

In general, implementations can be correct if the data type is bitwise regular and suffer inevitably from ABA if merely semi-regular (or equality is not bitwise).

If the data type is bitwise regular the CAS operation is enough, for example for integers. ABA isn't a problem, because A equals A, end of story.

Harris algorithm appears to allow non-bitwise regular types to provide a set implementation, but at the expense of unbounded performance, and worse, it begs the question how the nodes of the lists are allocated. This implies the core problem to be solved is to provide an O(1) lock free allocator and such a data structure cannot exist because the pointers involved are only semi-regular. In particular, whilst a pointer is regular when it merely identifies an array of bytes as a block of memory, when it identifies a node of a linked list, it is no longer regular because pointer equality does not ensure the embedded next pointers are equal.

There is no possible work around. I doubt DCAS will help and I doubt relaxing the constant time constraint to linear does either.

With non-preemptable threads a spinlock will work provided the critical section it protects is bounded time. You can get these on MacOS and RT-Linux.

Simple lock free stack c++11

Tags:

c++

stack

multithreading

c++11

lock-free

dcmm88

2 Answers

Casey

Yttrill

Recent Activity

Donate For Us

Simple lock free stack c++11

Tags:

c++

stack

multithreading

c++11

lock-free

dcmm88

2 Answers

Casey

Yttrill

Related questions

Recent Activity

Donate For Us