Memory Model: preventing store-release and load-acquire reordering

It is known that, unlike Java's volatiles, .NET's volatiles allow a volatile write to be reordered with a subsequent volatile read from a different location. When that is a problem, the usual recommendation is to place a Thread.MemoryBarrier call between them, or to use Interlocked.Exchange instead of the volatile write.

This works, but Thread.MemoryBarrier can be a performance killer in highly optimized lock-free code.

I thought about it a bit and came up with an idea. I would like somebody to tell me whether I am on the right track.
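For reference, the Interlocked.Exchange variant mentioned above might look like the sketch below. The class and field names are hypothetical and only stand in for the two shared locations; it relies on interlocked operations acting as full fences.

```csharp
using System;
using System.Threading;

// Hypothetical holder for two shared flags; all names are illustrative only.
public static class ExchangeInsteadOfVolatileWrite
{
    private static int _flag1; // plays the role of "volatile1"
    private static int _flag2; // plays the role of "volatile2"

    // Stores to _flag1 with a full fence, then reads _flag2.
    public static int PublishThenObserve()
    {
        // Interlocked.Exchange acts as a full fence, so the store to _flag1
        // cannot be reordered with the load from _flag2 that follows it.
        Interlocked.Exchange(ref _flag1, 1);
        return Volatile.Read(ref _flag2);
    }

    // Exposes the current value of _flag1 for inspection.
    public static int Flag1
    {
        get { return Volatile.Read(ref _flag1); }
    }
}
```

The store and the fence are a single operation here, which is why it can replace the separate volatile write plus Thread.MemoryBarrier pair.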

So, the idea is the following:

We want to prevent reordering between these two accesses:

 volatile1 write

 volatile2 read

From the .NET memory model we know that:

 1) a write to a variable cannot be reordered with a following read from
    the same variable
 2) no volatile accesses can be eliminated
 3) no memory accesses can be reordered with a previous volatile read

To prevent unwanted reordering between write and read we introduce a dummy volatile read from the variable we've just written to:

 A) volatile1 write
 B) volatile1 read [to a visible (accessible | potentially shared) location]
 C) volatile2 read

In that case B cannot be reordered with A because they both access the same variable, C cannot be reordered with B because two volatile reads cannot be reordered with each other, and so, transitively, C cannot be reordered with A.
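In C# the proposed A/B/C sequence could be written roughly as follows. This is only a sketch of the idea being asked about, with hypothetical names (`DummyReadSketch`, `_sink`); whether step B really acts as a barrier is exactly the open question.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of the proposed pattern; not a verified barrier.
public static class DummyReadSketch
{
    private static int _volatile1;
    private static int _volatile2;
    private static int _sink; // visible location that receives the dummy read

    public static int WriteThenRead()
    {
        Volatile.Write(ref _volatile1, 1);      // A) volatile1 write
        _sink = Volatile.Read(ref _volatile1);  // B) dummy volatile1 read
        return Volatile.Read(ref _volatile2);   // C) volatile2 read
    }

    // Exposes the value captured by the dummy read.
    public static int Sink
    {
        get { return _sink; }
    }
}
```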

And the question:

Am I right? Can that dummy volatile read be used as a lightweight memory barrier in this scenario?

asked May 15 '13 18:05

OmariO

2 Answers

Here I will use an arrow notation to conceptualize the memory barriers. I use an up arrow ↑ and a down arrow ↓ to represent volatile writes and reads respectively. Think of the arrow head as pushing away any other reads or writes. So no other memory access can move past the arrow head, but they can move past the tail.

Consider your first example. This is how it would be conceptualized.

↑          
volatile1 write  // A
volatile2 read   // B
↓

So clearly we can see that the read and the write are allowed to switch positions. You are correct.

Now consider your second example. You claimed that introducing a dummy read would prevent the write of A and the read of B from getting swapped.

↑          
volatile1 write  // A
volatile1 read   // A
↓
volatile2 read   // B
↓

We can see that B is prevented from floating up by the dummy read of A. We can also see that the read of A cannot float down because, by inference, that would be the same as B moving up before A. But, notice that we have no ↑ arrow that would prevent the write to A from floating down (remember it can still move past the tail of an arrow). So no, at least theoretically, injecting a dummy read of A will not prevent the original write of A and the read of B from getting swapped because the write to A is still allowed to move downward.

I had to really think about this scenario. One thing I pondered for quite some time is whether the read and the write to A are locked together in tandem. If so, that would prevent the write to A from moving down, because it would have to take the read with it, which we already said was prevented. So if you go with that school of thought, your solution might just work.

But I read the specification again and I see nothing special mentioned about volatile accesses to the same variable. Obviously, the thread has to execute in a manner that is logically consistent with the original program sequence (that is mentioned in the specification). But I can visualize ways the compiler or hardware could optimize (or otherwise reorder) that tandem access of A and still get the same result. So I simply have to side with caution here and assume that the write to A can move down. Remember, a volatile read does not mean "fresh read from main memory". The write to A could be cached in a register, and the read could then come from that register, delaying the actual write to a later time. Volatile semantics do not prevent that scenario as far as I know.

The correct solution would be to put a call to Thread.MemoryBarrier in between the accesses. You can see how this is conceptualized with the arrow notation.

↑          
volatile1 write       // A
↑
Thread.MemoryBarrier
↓
volatile2 read        // B
↓

Now you can see that the read is not allowed to float up and the write is not allowed to float down preventing the swap.
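A minimal C# sketch of that fenced pattern, assuming two threads that each set their own flag and then check the other's. The class and method names are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical two-flag handshake; names are illustrative only.
public static class FencedHandshake
{
    private static int _a;
    private static int _b;

    // Intended for thread 1: publish A, fence, then look at B.
    public static int SetAThenReadB()
    {
        Volatile.Write(ref _a, 1);
        Thread.MemoryBarrier();        // full fence: the write cannot sink, the read cannot rise
        return Volatile.Read(ref _b);
    }

    // Intended for thread 2: publish B, fence, then look at A.
    public static int SetBThenReadA()
    {
        Volatile.Write(ref _b, 1);
        Thread.MemoryBarrier();        // same full fence on the other side
        return Volatile.Read(ref _a);
    }

    // Clears both flags so the handshake can be repeated.
    public static void Reset()
    {
        Volatile.Write(ref _a, 0);
        Volatile.Write(ref _b, 0);
    }
}
```

With the full fence on both sides, the two methods cannot both return 0 when run concurrently; without the fences, that outcome is permitted.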

You can see some of my other memory barrier answers using this arrow notation here, here, and here just to name a few.

answered Oct 02 '22 15:10

Brian Gideon

I forgot to post the answer I soon found back to SO. Better late than never.

It turns out it is impossible, thanks to how processors (at least the x86/x64 kind) optimize memory accesses. I found the answer while reading Intel's manuals on its processors. Example 8-5, "Intra-Processor Forwarding is Allowed", looked suspicious. Googling for "store buffer forwarding" led to Joe Duffy's blog posts (first and second; please read them).

To optimize writes, a processor uses store buffers (per-processor queues of write operations). Buffering writes locally enables a further optimization: satisfying reads from previously buffered writes to the same memory location that have not yet left the processor. The technique is called store-buffer forwarding (or store-to-load forwarding).

The end result in our case is that, because the read at B is satisfied from local storage (the store buffer), it is not treated as a volatile read and can be reordered with subsequent volatile reads from another memory location (C).

This looks like a violation of the rule "volatile reads don't reorder with each other". Yes, it is a violation, but a very rare and exotic one. Why did it happen? Probably because Intel released its first formal document on the memory model of its processors years after .NET (and its JIT compiler) saw the light of day.

So the answer is: no, the dummy read (B) does not prevent reordering between A and C and cannot be used as a lightweight memory barrier.
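A litmus test along the following lines can be used to look for the effect. It is only a sketch with hypothetical names, modelled on the shape of Intel's forwarding example: each thread writes its own variable, does the dummy read of it, then reads the other thread's variable. How often (if ever) both threads observe 0 depends on the hardware, the JIT and timing, so a count of zero on one machine proves nothing.

```csharp
using System;
using System.Threading;

// Hypothetical litmus harness; names and structure are illustrative only.
public static class ForwardingLitmus
{
    private static int _x, _y;          // the two shared locations
    private static int _sink1, _sink2;  // visible targets for the dummy reads
    private static int _r2, _r4;        // what each thread saw in the other's variable

    // Runs the test 'iterations' times and counts the rounds in which
    // both threads read 0 from the other thread's variable.
    public static int CountBothZero(int iterations)
    {
        int bothZero = 0;
        using (var barrier = new Barrier(2))
        {
            var other = new Thread(() =>
            {
                for (int i = 0; i < iterations; i++)
                {
                    barrier.SignalAndWait();            // start of round
                    Volatile.Write(ref _y, 1);          // A
                    _sink2 = Volatile.Read(ref _y);     // B (dummy read)
                    _r4 = Volatile.Read(ref _x);        // C
                    barrier.SignalAndWait();            // end of round
                }
            });
            other.Start();

            for (int i = 0; i < iterations; i++)
            {
                _x = 0;                                 // reset while the other thread waits
                _y = 0;
                barrier.SignalAndWait();                // start of round
                Volatile.Write(ref _x, 1);              // A
                _sink1 = Volatile.Read(ref _x);         // B (dummy read)
                _r2 = Volatile.Read(ref _y);            // C
                barrier.SignalAndWait();                // end of round
                if (_r2 == 0 && _r4 == 0) bothZero++;   // dummy read did not act as a fence
            }
            other.Join();
        }
        return bothZero;
    }
}
```

Calling `ForwardingLitmus.CountBothZero(1000000)` and printing the result gives a rough indication on a given machine; inserting Thread.MemoryBarrier in place of the dummy reads should drive the count to zero.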

answered Oct 02 '22 16:10

OmariO

Tags: performance, c#, .net, memory-model, volatile
