
Looking for critique of my thread safe, lock-free queue implementation

So, I've written a queue, after a bit of research. It uses a fixed-size buffer, so it's a circular queue. It has to be thread-safe, and I've tried to make it lock-free. I'd like to know what's wrong with it, because these kinds of things are difficult to predict on my own.

Here's the header:

#include <atomic>
#include <cstddef>

using std::atomic;
typedef unsigned int uint;

template <class T>
class LockFreeQueue
{
public:
    LockFreeQueue(uint buffersize) : buffer(NULL), ifront1(0), ifront2(0), iback1(0), iback2(0), size(buffersize) { buffer = new atomic <T>[buffersize]; }
    ~LockFreeQueue(void) { if (buffer) delete[] buffer; }

    bool pop(T* output);
    bool push(T input);

private:
    uint incr(const uint val)
        {return (val + 1) % size;}

    atomic <T>* buffer;
    atomic <uint> ifront1, ifront2, iback1, iback2;
    uint size;
};

And here's the implementation:

template <class T>
bool LockFreeQueue<T>::pop(T* output)
{
    while (true)
    {
        /* Fetch ifront1 and store it in i. */
        uint i = ifront1;

        /* If ifront1 == iback2, the queue is empty. */
        if (i == iback2)
            return false;

        /* If i still equals ifront1, increment ifront1, */
        /* Incrementing ifront1 notifies other pop() calls that they can read the next element. */
        if (ifront1.compare_exchange_weak(i, incr(i)))
        {
            /* then fetch the output. */
            *output = buffer[i];
            /* Incrementing ifront2 notifies push() that it's safe to write. */
            ++ifront2;
            return true;
        }

        /* If i no longer equals ifront1, we loop around and try again. */
    }
}

template <class T>
bool LockFreeQueue<T>::push(T input)
{
    while (true)
    {
        /* Fetch iback1 and store it in i. */
        uint i = iback1;

        /* If ifront2 == (iback1 + 1), the queue is full. */
        if (ifront2 == incr(i))
            return false;

        /* If i still equals iback1, increment iback1, */
        /* Incrementing iback1 notifies other push() calls that they can write a new element. */
        if (iback1.compare_exchange_weak(i, incr(i)))
        {
            /* then store the input. */
            buffer[i] = input;
            /* Incrementing iback2 notifies pop() that it's safe to read. */
            ++iback2;
            return true;
        }

        /* If i no longer equals iback1, we loop around and try again. */
    }
}

EDIT: I made some major modifications to the code, based on comments (Thanks KillianDS and n.m.!). Most importantly, ifront and iback are now ifront1, ifront2, iback1, and iback2. push() now increments iback1, notifying other pushing threads that they can safely write to the next element (as long as the queue isn't full), writes the element, then increments iback2. iback2 is the only back index that pop() checks. pop() does the same thing, but with the ifront1 and ifront2 indices.
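To make the index arithmetic concrete, here is a minimal single-threaded sketch of the empty and full tests used above. It only models the wrap-around logic, not the atomics; is_empty, is_full and usable_capacity are hypothetical helper names that do not appear in the class. It suggests that a buffer of size slots can hold at most size - 1 elements, because one slot always stays unused.

```cpp
#include <cassert>

// Same wrap-around step as LockFreeQueue::incr, with size passed explicitly.
unsigned incr(unsigned val, unsigned size) { return (val + 1) % size; }

// Empty test used by pop(): the front index has caught up with the back index.
bool is_empty(unsigned front, unsigned back) { return front == back; }

// Full test used by push(): advancing the back index would hit the front index.
bool is_full(unsigned front, unsigned back, unsigned size) {
    return front == incr(back, size);
}

// Hypothetical helper: counts how many pushes succeed on an empty queue.
unsigned usable_capacity(unsigned size) {
    assert(size > 0);  // a zero-sized buffer would divide by zero in incr
    unsigned front = 0, back = 0, stored = 0;
    while (!is_full(front, back, size)) {
        back = incr(back, size);  // model one successful push
        ++stored;
    }
    return stored;
}
```

With these definitions, usable_capacity(4) evaluates to 3, so callers who need room for N elements would have to pass N + 1 as the buffer size.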

Now, once again, I fall into the trap of "this SHOULD work...", but I don't know anything about formal proofs or anything like that. At least this time, I can't think of a potential way that it could fail. Any advice is appreciated, except for "stop trying to write lock-free code".

Haydn V. Harach asked Feb 04 '14 19:02


2 Answers

The proper way to approach a lock free data structure is to write a semi formal proof that your design works in pseudo code. You shouldn't be asking "is this lock free code thread safe", but rather "does my proof that this lock free code is thread safe have any errors?"

Only after you have a formal proof that a pseudo code design works do you try to implement it. Often this brings to light issues like garbage collection that have to be handled carefully.

Your code should be the formal proof and pseudo code in comments, with the relatively unimportant implementation interspersed within.
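As a rough illustration of what "proof in comments" can look like, here is a sketch of a much simpler structure than the one in the question: a ring buffer restricted to one producer thread and one consumer thread. This is a swapped-in, easier problem, not a fix for the multi-producer queue above, and SpscRing and its member names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: exactly one thread calls push() and exactly one calls pop().
//
// Invariants:
//   I1. Only the producer writes tail_; only the consumer writes head_.
//   I2. Slots in [head_, tail_) (mod size) hold published values.
//   I3. The producer writes only to slot tail_, which is outside that range.
//
// Argument: the producer fills slot t, then publishes it with a release
// store to tail_. The consumer reads tail_ with acquire, so it sees the
// slot contents before it sees the slot as readable. The symmetric pair
// on head_ ensures the producer reuses a slot only after it was read.
template <class T>
class SpscRing {
public:
    explicit SpscRing(std::size_t slots) : buf_(slots), head_(0), tail_(0) {
        assert(slots >= 2);  // one slot always stays empty
    }

    // Producer thread only.
    bool push(const T& value) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire))
            return false;                                // full
        buf_[t] = value;                                 // I3: slot t is unpublished
        tail_.store(next, std::memory_order_release);    // publish (I2)
        return true;
    }

    // Consumer thread only.
    bool pop(T& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;                                // empty
        out = buf_[h];                                   // I2: slot h is published
        head_.store((h + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_, tail_;
};
```

The point is the layout rather than this particular queue: each line that touches shared state names the invariant it relies on, so a reviewer can attack the argument instead of guessing at it.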

Verifying that your code is correct then consists of understanding the pseudo code, checking the proof, and then checking whether your code fails to map to your pseudo code and proof.

Taking code directly and trying to check that it is lock free is impractical. The proof is what matters when designing this kind of thing; the actual code is secondary, because the proof is the hard part.

While and after doing all of the above, and having other people validate it, you have to put your code through practical tests to see whether you have a blind spot that leaves a hole, whether you misunderstand your concurrency primitives, or whether your concurrency primitives themselves have bugs.
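A practical test of that kind might look like the sketch below: several producers push distinct integers, several consumers pop them, and the harness checks that every value came out exactly once. The names stress and LockedQueue are hypothetical. LockedQueue is a mutex-guarded stand-in with the same push/pop shape as the queue in the question, included only so the harness itself can be checked. The harness also assumes pop() returns false only when the queue is really empty. Passing such a test proves nothing; failing it proves a lot.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical reference queue: bounded, protected by a mutex.
class LockedQueue {
public:
    explicit LockedQueue(std::size_t cap) : cap_(cap) {}
    bool push(int v) {
        std::lock_guard<std::mutex> g(m_);
        if (q_.size() >= cap_) return false;
        q_.push(v);
        return true;
    }
    bool pop(int* out) {
        std::lock_guard<std::mutex> g(m_);
        if (q_.empty()) return false;
        *out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::queue<int> q_;
    std::size_t cap_;
};

// Returns true if every pushed value was popped exactly once.
template <class Queue>
bool stress(Queue& q, int producers, int consumers, int per_producer) {
    const int total = producers * per_producer;
    std::vector<std::atomic<int>> seen(total);
    for (auto& s : seen) s.store(0);
    std::atomic<int> done{0};        // producers that have finished
    std::atomic<bool> bad{false};    // set if a value outside [0, total) appears
    std::vector<std::thread> threads;

    for (int p = 0; p < producers; ++p)
        threads.emplace_back([&, p] {
            for (int k = 0; k < per_producer; ++k)
                while (!q.push(p * per_producer + k)) std::this_thread::yield();
            done.fetch_add(1);
        });

    for (int c = 0; c < consumers; ++c)
        threads.emplace_back([&] {
            for (;;) {
                // Read "finished" before popping, so a failed pop after
                // all producers are done really means the queue is drained.
                const bool finished = done.load() == producers;
                int v = -1;
                if (q.pop(&v)) {
                    if (v >= 0 && v < total) seen[v].fetch_add(1);
                    else bad.store(true);
                } else if (finished) {
                    break;
                } else {
                    std::this_thread::yield();
                }
            }
        });

    for (auto& t : threads) t.join();
    if (bad.load()) return false;
    for (auto& s : seen)
        if (s.load() != 1) return false;  // lost or duplicated value
    return true;
}
```

Usage would be along the lines of LockedQueue q(8); followed by stress(q, 4, 4, 100000), then swapping in the lock-free queue and running it many times, ideally under a tool such as ThreadSanitizer.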

If you aren't interested in writing semi formal proofs to design your code, you shouldn't be hand rolling lock free algorithms and data structures and putting them into place in production code.

Determining if a pile of code "is thread safe" is putting all of the work load on other people. You need to have an argument why your code "is thread safe" arranged in such a way that it is as easy as possible for others to find holes in it. If your argument why your code "is thread safe" is arranged in ways that makes it harder to find holes, your code cannot be presumed to be thread safe, even if nobody can spot a hole in your code.

The code you posted above is a mess. It contains commented out code, no formal invariants, no proofs, and no strong description of why it is thread safe, and in general it makes no attempt to present itself as thread safe in a way that makes flaws easy to spot. As such, no reasonable reader will consider the code thread safe, even if they cannot find any errors in it.

Yakk - Adam Nevraumont answered Dec 16 '22 16:12


No, it's not thread safe. Consider the following sequence of events:

  1. The first thread completes if (ifront.compare_exchange_weak(i, incr(i))) in pop and is then put to sleep by the scheduler.
  2. The second thread calls push size times (just enough to make ifront equal to the value of i in the first thread).
  3. The first thread wakes.

In this case, buffer[i] in pop will contain the last pushed value, which is wrong.

Alex Telishev answered Dec 16 '22 16:12