Least-restrictive memory ordering for single-producer, single-consumer ringbuffer?

I have a RingBuffer which services one consumer and one producer and uses two integers to detect new data:

_lastReadIndex
_lastWrittenIndex

so there is unread data in the ringbuffer when these two values are not equal.

The Producer increments _lastWrittenIndex (modulo the ringbuffer size, to wrap around) when an item is added to the ringbuffer.

The Consumer spins, reading both values; when they differ there is new data, and it increments _lastReadIndex (again modulo the ringbuffer size).
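For reference, a minimal sketch of the design (the int payload and the buffer size are placeholders; all atomics use the default sequentially consistent ordering, which is exactly what I am hoping to relax):

```cpp
#include <atomic>
#include <cstddef>

// Sketch of the design described above; the int payload and size N
// are placeholders. All atomic accesses use the default (seq_cst)
// ordering, which is what the question asks how far to relax.
template <std::size_t N>
struct RingBuffer {
    int buffer[N];
    std::atomic<std::size_t> _lastWrittenIndex{0};
    std::atomic<std::size_t> _lastReadIndex{0};

    // Unread data exists exactly when the two indices differ.
    bool empty() const { return _lastReadIndex == _lastWrittenIndex; }

    void write(int v) {                        // called by the producer only
        std::size_t next = (_lastWrittenIndex + 1) % N;
        buffer[next] = v;
        _lastWrittenIndex = next;              // publish the new item
    }

    int read() {                               // called by the consumer only
        while (empty())
            ;                                  // spin until new data arrives
        std::size_t next = (_lastReadIndex + 1) % N;
        int v = buffer[next];
        _lastReadIndex = next;                 // mark the item consumed
        return v;
    }
};
```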

Those three emphasised characteristics (single producer, single consumer, and a spinning consumer) capture the requirements with regard to multithreading and memory barriers.

How far can I relax the memory ordering for this design, accounting for Intel's memory model? I believe Intel's memory model allows loads to be re-ordered with earlier stores to different addresses?

EDIT: I am using the C++11 atomic library (std::memory_order_xxxx etc).

asked Jan 23 '16 by user997112

1 Answer

A few things you have to do before anything else:

Take the modulus only at the points where you read and write the buffer, but keep _lastReadIndex and _lastWrittenIndex themselves as monotonically increasing counters, so you always know how much data is available, how much has been lost, and can perhaps block the writer if it overruns the reader after a full cycle.

And, very importantly, avoid false sharing as much as possible: put the reader and writer indices on separate cache lines.
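The two points above can be sketched together: monotonically increasing 64-bit counters, each padded onto its own cache line. The 64-byte line size, the buffer size, and the helper names are my assumptions, not part of the question:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Sketch: monotonic counters on separate cache lines.
// The 64-byte cache line, kBufSize and the helper names are assumptions.
constexpr std::size_t kCacheLine = 64;
constexpr std::size_t kBufSize   = 1024;

struct Indices {
    alignas(kCacheLine) std::atomic<std::uint64_t> _lastWrittenIndex{0};
    alignas(kCacheLine) std::atomic<std::uint64_t> _lastReadIndex{0};

    // The counters never wrap, so their difference is always the fill level.
    std::uint64_t available() const {
        return _lastWrittenIndex.load(std::memory_order_acquire)
             - _lastReadIndex.load(std::memory_order_acquire);
    }
    bool full() const { return available() == kBufSize; }

    // Take the modulus only when actually touching the buffer:
    static std::size_t slot(std::uint64_t index) { return index % kBufSize; }
};

static_assert(sizeof(Indices) >= 2 * kCacheLine,
              "each index should occupy its own cache line");
```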

Now, to your question:

If you're trying to be portable, the memory ordering you use in your code should not depend on the architecture; the standard atomic operations take care of that. You only need to make sure the data is in the buffer before the write index is incremented, which means release semantics on that store. You also have to make sure the writer actually writes the data to memory, and that it is not optimized to remain only in registers.

newIndex = _lastWrittenIndex.load(std::memory_order_relaxed) + 1; // only the producer writes this index
buffer[newIndex % bufSize] = newData;
_lastWrittenIndex.store(newIndex, std::memory_order_release);

On x86/64, this will be the same as:

newIndex = _lastWrittenIndex+1;
buffer[newIndex % bufSize] = newData;
// release semantics means a compiler reorder barrier before the store:
barrier(); // translates to `asm volatile("":::"memory");`
*(volatile int*)&_lastWrittenIndex = newIndex;

As long as you access _lastWrittenIndex no more than absolutely necessary, as above, you might as well declare it volatile, but keep in mind the barrier is still needed!
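For completeness, a sketch of the consumer side that pairs with the release store above: an acquire load of _lastWrittenIndex guarantees that the data written before the release is visible. The buffer, bufSize and monotonic index types are assumptions carried over from the producer snippet:

```cpp
#include <atomic>
#include <cstddef>

// Consumer-side sketch pairing with the producer's release store.
// buffer, bufSize and the monotonic indices are assumptions.
constexpr std::size_t bufSize = 16;
int buffer[bufSize];
std::atomic<std::size_t> _lastWrittenIndex{0};
std::atomic<std::size_t> _lastReadIndex{0};

bool try_read(int &out) {
    // Only the consumer modifies _lastReadIndex, so relaxed is enough here.
    std::size_t r = _lastReadIndex.load(std::memory_order_relaxed);
    // Acquire pairs with the producer's release: seeing the new index
    // means also seeing the data stored into the buffer before it.
    if (r == _lastWrittenIndex.load(std::memory_order_acquire))
        return false;                 // nothing new yet
    std::size_t next = r + 1;
    out = buffer[next % bufSize];
    // Release, so a producer checking for overrun sees the slot freed
    // only after the data has been read out.
    _lastReadIndex.store(next, std::memory_order_release);
    return true;
}
```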

answered Oct 08 '22 by BitWhistler