Loop hoisting a volatile read
I have read in many places that a volatile variable cannot be hoisted out of a loop or an if, but I cannot find this mentioned anywhere in the C# spec. Is this a hidden feature?
All writes are volatile in C#
Does this mean that all writes have the same properties, even without the volatile keyword? E.g., do ordinary writes in C# have release semantics, and do all writes flush the store buffer of the processor?
Release semantics
Is this a formal way of saying that the store buffer of a processor is emptied when a volatile write is done?
Acquire semantics
Is this a formal way of saying that it should not load a variable into a register, but fetch it from memory every time?
In this article, Igoro speaks of "thread cache". I perfectly understand that this is imaginary, but is he in fact referring to:
Or is this just my imagination?
Delayed writing
I have read many places that writes can be delayed. Is this because of the reordering, and the store buffer?
Thread.MemoryBarrier
I understand that a side effect of Thread.MemoryBarrier is that the JIT emits a `lock or` instruction when transforming IL to asm, and that this is why it can solve the delayed write to main memory (in the while loop) in, e.g., this example:
static void Main()
{
    bool complete = false;
    var t = new Thread (() =>
    {
        bool toggle = false;
        while (!complete) toggle = !toggle;
    });
    t.Start();
    Thread.Sleep (1000);
    complete = true;
    t.Join(); // Blocks indefinitely
}
But is this always the case? Will a call to Thread.MemoryBarrier always flush the store buffer and fetch updated values into the processor cache? I understand that the complete variable is not hoisted into a register and is fetched from the processor cache every time, but the processor cache is updated because of the call to Thread.MemoryBarrier.
Am I on thin ice here, or do I have some sort of understanding of volatile and Thread.MemoryBarrier?
volatile in most programming languages does not imply a real CPU read memory barrier but an order to the compiler not to optimize the reads via caching in a register. This means that the reading process/thread will get the value "eventually".
In computing, a memory barrier, also known as a membar, memory fence or fence instruction, is a type of barrier instruction that causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction.
A memory barrier is implemented by the processor hardware. CPUs with different architectures have different memory barrier instructions. Therefore, the programmer needs to call a memory barrier explicitly in code to solve the preceding problem.
That's a mouthful...
I'm going to start with a few of your questions and update my answer as I go.
Loop hoisting a volatile
I have read in many places that a volatile variable cannot be hoisted out of a loop or an if, but I cannot find this mentioned anywhere in the C# spec. Is this a hidden feature?
MSDN says "Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread". This is kind of a broad statement, but it includes hoisting or "lifting" variables out of a loop.
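To make this concrete, here's a minimal sketch of mine (the class, field, and timing are illustrative, not from the question) of the loop-hoisting hazard that volatile rules out:

```csharp
using System;
using System.Threading;

class VolatileFlag
{
    // volatile forbids the optimization MSDN describes: the JIT may not
    // cache this field in a register or hoist the read out of the loop.
    static volatile bool stop;

    static void Main()
    {
        var worker = new Thread(() =>
        {
            // Without volatile, a Release build could legally turn this
            // into `if (!stop) while (true) { }` and spin forever.
            while (!stop) { }
            Console.WriteLine("worker saw stop");
        });
        worker.Start();
        Thread.Sleep(100);   // give the worker time to enter the loop
        stop = true;         // volatile write, observed by the worker
        worker.Join();       // terminates because each read is fresh
    }
}
```

With the volatile modifier removed, this program may or may not hang depending on build configuration and JIT, which is exactly why the spec language matters.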
All writes are volatile in C#
Does this mean that all writes have the same properties, even without the volatile keyword? E.g., do ordinary writes in C# have release semantics, and do all writes flush the store buffer of the processor?
Regular writes are not volatile. They do have release semantics, but they don't flush the CPU's write-buffer. At least, not according to the spec.
From Joe Duffy's CLR 2.0 Memory Model
Rule 2: All stores have release semantics, i.e. no load or store may move after one.
I've read a few articles stating that all writes are volatile in C# (like the one you linked to), but this is a common misconception. From the horse's mouth (The C# Memory Model in Theory and Practice, Part 2):
Consequently, the author might say something like, “In the .NET 2.0 memory model, all writes are volatile—even those to non-volatile fields.” (...) This behavior isn’t guaranteed by the ECMA C# spec, and, consequently, might not hold in future versions of the .NET Framework and on future architectures (and, in fact, does not hold in the .NET Framework 4.5 on ARM).
Release semantics
Is this a formal way of saying that the store buffer of a processor is emptied when a volatile write is done?
No, those are two different things. If an instruction has "release semantics", then no store/load instruction will ever be moved below said instruction. The definition says nothing regarding flushing the write-buffer. It only concerns instruction re-ordering.
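As an illustration (my own sketch, not from the spec), release/acquire semantics are what make the classic publish pattern safe, independent of any buffer flushing:

```csharp
using System;
using System.Threading;

class Publish
{
    static int payload;          // ordinary, non-volatile field
    static volatile bool ready;  // the publication flag

    static void Main()
    {
        var reader = new Thread(() =>
        {
            // Acquire semantics: the load of 'payload' below may not
            // be moved above this volatile read of 'ready'.
            while (!ready) { }
            Console.WriteLine(payload);   // 42, never a stale 0
        });
        reader.Start();

        payload = 42;   // ordinary store
        ready = true;   // release semantics: the store to 'payload'
                        // may not be moved below this volatile write
        reader.Join();
    }
}
```

Note that nothing here says when the write reaches memory, only in what order the two stores (and the two loads) may be observed.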
Delayed writing
I have read many places that writes can be delayed. Is this because of the reordering, and the store buffer?
Yes. Write instructions can be delayed/reordered by either the compiler, the jitter or the CPU itself.
So a volatile write has two properties: release semantics, and store buffer flushing.
Sort of. I prefer to think of it this way:
The C# Specification of the volatile keyword guarantees one property: that reads have acquire-semantics and writes have release-semantics. This is done by emitting the necessary release/acquire fences.
The actual Microsoft C# implementation adds another property: reads will be fresh, and writes will be flushed to memory immediately and made visible to other processors. To accomplish this, the compiler emits an OpCodes.Volatile prefix, and the jitter picks this up and tells the processor not to keep this variable in its registers.
This means that a different C# implementation that doesn't guarantee immediacy will be a perfectly valid implementation.
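For what it's worth, the same per-access guarantee is also exposed directly through the Volatile class, which emits the same fences without marking the field itself (a small sketch of mine):

```csharp
using System;
using System.Threading;

class ExplicitVolatile
{
    static bool flag;   // deliberately not declared volatile

    static void Main()
    {
        // Volatile.Write gives this single store release semantics,
        // much as the volatile. IL prefix does for a volatile field...
        Volatile.Write(ref flag, true);

        // ...and Volatile.Read gives this single load acquire semantics.
        // Per the spec, only the ordering is guaranteed; immediacy is an
        // implementation property, as discussed above.
        Console.WriteLine(Volatile.Read(ref flag));
    }
}
```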
Memory Barrier
bool complete = false;
var t = new Thread (() =>
{
    bool toggle = false;
    while (!complete) toggle = !toggle;
});
t.Start();
Thread.Sleep(1000);
complete = true;
t.Join(); // blocks
But is this always the case? Will a call to Thread.MemoryBarrier always flush the store buffer and fetch updated values into the processor cache?
Here's a tip: try to abstract yourself away from concepts like flushing the store buffer, or reading straight from memory. The concept of a memory barrier (or a full-fence) is in no way related to the two former concepts.
A memory barrier has one sole purpose: ensure that store/load instructions below the fence are not moved above the fence, and vice versa. If C#'s Thread.MemoryBarrier just so happens to flush pending writes, you should think of that as a side effect, not the main intent.
Now, let's get to the point. The code you posted (which blocks when compiled in Release mode and run without a debugger) can be fixed by introducing a full fence anywhere inside the while loop. Why? Let's first unroll the loop. Here's how the first few iterations would look:
if(complete) return;
toggle = !toggle;
if(complete) return;
toggle = !toggle;
if(complete) return;
toggle = !toggle;
...
Because complete is not marked as volatile and there are no fences, the compiler and the CPU are allowed to move the reads of the complete field.
In fact, the CLR's Memory Model (see rule 6) allows loads to be deleted (!) when coalescing adjacent loads. So, this could happen:
if(complete) return;
toggle = !toggle;
toggle = !toggle;
toggle = !toggle;
...
Notice that this is logically equivalent to hoisting the read out of the loop, and that's exactly what the compiler may do.
By introducing a full fence either before or after toggle = !toggle, you'd prevent the compiler from moving the reads up and merging them together.
if(complete) return;
toggle = !toggle;
#FENCE
if(complete) return;
toggle = !toggle;
#FENCE
if(complete) return;
toggle = !toggle;
#FENCE
...
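In C#, the unrolled-with-fences picture corresponds to a fix along these lines (a sketch of mine; the 100 ms sleep is illustrative):

```csharp
using System;
using System.Threading;

class BarrierFix
{
    static void Main()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!complete)
            {
                toggle = !toggle;
                // Full fence: reads of 'complete' may not be moved
                // above it, so they can't be hoisted or merged.
                Thread.MemoryBarrier();
            }
        });
        t.Start();
        Thread.Sleep(100);
        complete = true;
        t.Join();   // now terminates
        Console.WriteLine("joined");
    }
}
```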
In conclusion, the key to solving these issues is ensuring that the instructions will be executed in the correct order. It has nothing to do with how long it takes for other processors to see one processor's writes.
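Equivalently, you could keep the loop body fence-free and make the read itself ordered with Volatile.Read (again a sketch, and not the only idiom that works):

```csharp
using System;
using System.Threading;

class VolatileReadFix
{
    static void Main()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            // Each Volatile.Read has acquire semantics, so the JIT may
            // not coalesce these loads or hoist them out of the loop.
            while (!Volatile.Read(ref complete)) toggle = !toggle;
        });
        t.Start();
        Thread.Sleep(100);
        Volatile.Write(ref complete, true);
        t.Join();   // terminates
        Console.WriteLine("joined");
    }
}
```

Declaring complete as a volatile field would achieve the same ordering; the point in all variants is constraining instruction order, not racing the caches.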