In the process of trying to understand how to deal with lock free code, I attempted to write a single consumer/single producer lock free queue. As always, I checked papers, articles, and code, especially considering that this is a somewhat delicate subject.
So, I stumbled upon an implementation of this data structure in the Folly library, which can be found here : https://github.com/facebook/folly/blob/master/folly/ProducerConsumerQueue.h
As every lock-free queue I saw, this one seems to use a circular buffer, so we got two std::atomic<unsigned int>
variables : readIndex_
and writeIndex_
. The readIndex_
indicate the next index at which we will read, and writeIndex_
the next at which we will write. Seems simple enough.
So, the implementation seems clean and pretty simple at first sight, but I found one thing troublesome. Indeed, some functions like isEmpty()
, isFull()
or guessSize()
are using std::memory_order_consume
to retrieve the value of the indices.
And to be fair, I really don't know what purpose they do serve. Don't get me wrong, I'm aware of the use of std::memory_order_consume
in the classical case of dependency-carrying through an atomic pointer, but here, we do not seem to carry any dependency ! We just got indices, unsigned integers, we do not create dependencies. To me in this scenario, a std::memory_order_relaxed
is equivalent.
However, I do not trust myself to understand memory ordering better than those who engineered this code, hence why I ask this question here. Is there anything I missed or misunderstood ?
I thank you in advance for your answers !
I thought the same thing a few months ago, so I submitted this pull request back in October, suggesting they change the std::memory_order_consume
loads into std::memory_order_relaxed
since the consume simply did not make sense, as there were no dependencies that could be carried from one thread to another using these functions. It ended up generating some discussion which revealed that a possible use case for isEmpty()
, isFull()
, and sizeGuess
was the following:
//Consumer
while( queue.isEmpty() ) {} // spin until producer writes
use_queue(); // At this point, the writes from producer _should_ be visible
Which is why they explained that std::memory_order_relaxed
wouldn't be appropriate and std::memory_order_consume
would be. However, this is only true because std::memory_order_consume
is promoted to std::memory_order_acquire
on all compilers that I know of. So although std::memory_order_consume
may appear to be providing the proper synchronization, it's quite misleading to leave that in the code and assume that it will remain correct, especially if std::memory_order_consume
were to ever be implemented as intended. The above use-case would fail to work on weaker architectures since the appropriate synchronization would not generated.
What they really need is to make those loads std::memory_order_acquire
for this to work as intended, which is why I submitted this other pull request some days back. Alternatively, they could take the acquire-loads out of the loop and use a fence at the end:
//Consumer
while( queue.isEmpty() ) {} // spin until producer writes using relaxed loads
std::atomic_thread_fence(std::memory_order_acquire);
use_queue(); // At this point, the writes from producer _should_ be visible
Either way, std::memory_order_consume
is incorrectly used here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With