Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can memory reordering cause C# to access unallocated memory?

It is my understanding that C# is a safe language and doesn't allow one to access unallocated memory, other than through the unsafe keyword. However, its memory model allows reordering when there is unsynchronized access between threads. This leads to race hazards where references to new instances appear to be available to racing threads before the instances have been fully initialized, and is a widely known problem for double-checked locking. Chris Brumme (from the CLR team) explains this in their Memory Model article:

Consider the standard double-locking protocol:

if (a == null)
{
    lock(obj)
    {
        if (a == null) 
            a = new A();
    }
}

This is a common technique for avoiding a lock on the read of ‘a’ in the typical case. It works just fine on X86. But it would be broken by a legal but weak implementation of the ECMA CLI spec. It’s true that, according to the ECMA spec, acquiring a lock has acquire semantics and releasing a lock has release semantics.

However, we have to assume that a series of stores have taken place during construction of ‘a’. Those stores can be arbitrarily reordered, including the possibility of delaying them until after the publishing store which assigns the new object to ‘a’. At that point, there is a small window before the store.release implied by leaving the lock. Inside that window, other CPUs can navigate through the reference ‘a’ and see a partially constructed instance.

I've always been confused by what "partially constructed instance" means. Assuming that the .NET runtime clears out memory on allocation rather than garbage collection (discussion), does this mean that the other thread might read memory that still contains data from garbage-collected objects (like what happens in unsafe languages)?

Consider the following concrete example:

byte[] buffer = new byte[2];

Parallel.Invoke(
    () => buffer = new byte[4],
    () => Console.WriteLine(BitConverter.ToString(buffer)));

The above has a race condition; the output would be either 00-00 or 00-00-00-00. However, is it possible that the second thread reads the new reference to buffer before the array's memory has been initialized to 0, and outputs some other arbitrary string instead?

like image 986
Douglas Avatar asked Jul 04 '18 21:07

Douglas


People also ask

How do I stop compiler reordering?

To prevent compiler reorderings at other times, you must use a compiler-specific barrier. GCC uses __asm__ __volatile__("":::"memory"); for this purpose. This is different from CPU reordering, a.k.a. the memory-ordering model.

Can compiler reorder memory access operations?

The compiler will do its best to provide the memory model you have requested by issuing the correct memory barriers, in case the underlying hardware implements a weak model. It also takes care of any memory reordering operation that might occur on the software side.

Why CPU reorder instructions?

Thinking in terms of instruction reordering lets us assume that once the instruction is finally executed, all cores will immediately see that new value in memory. It also lets us understand the CPU from the point of the user program rather than trying to understand the internals.

What is memory order C++?

(since C++20) std::memory_order specifies how memory accesses, including regular, non-atomic memory accesses, are to be ordered around an atomic operation.


1 Answers

Let's not bury the lede here: the answer to your question is no, you will never observe the pre-allocated state of memory in the CLR 2.0 memory model.

I'll now address a couple of your non-central points.

It is my understanding that C# is a safe language and doesn't allow one to access unallocated memory, other than through the unsafe keyword.

That is more or less correct. There are some mechanisms by which one can access bogus memory without using unsafe -- via unmanaged code, obviously, or by abusing structure layout. But in general, yes, C# is memory safe.

However, its memory model allows reordering when there is unsynchronized access between threads.

Again, that's more or less correct. A better way to think about it is that C# allows reordering at any point where the reordering would be invisible to a single threaded program, subject to certain constraints. Those constraints include introducing acquire and release semantics in certain cases, and preserving certain side effects at certain critical points.

Chris Brumme (from the CLR team) ...

The late great Chris's articles are gems and give a great deal of insight into the early days of the CLR, but I note that there have been some strengthenings of the memory model since 2003 when that article was written, particularly with respect to the issue you raise.

Chris is right that double-checked locking is super dangerous. There is a correct way to do double-checked locking in C#, and the moment you depart from it even slightly, you are off in the weeds of horrible bugs that only repro on weak memory model hardware.

does this mean that the other thread might read memory that still contains data from garbage-collected objects

I think your question is not specifically about the old weak ECMA memory model that Chris was describing, but rather about what guarantees are actually made today.

It is not possible for re-orderings to expose the previous state of objects. You are guaranteed that when you read a freshly-allocated object, its fields are all zeros.

This is made possible by the fact that all writes have release semantics in the current memory model; see this for details:

http://joeduffyblog.com/2007/11/10/clr-20-memory-model/

The write that initializes the memory to zero will not be moved forwards in time with respect to a read later.

I've always been confused by "partially constructed objects"

Joe discusses that here: http://joeduffyblog.com/2010/06/27/on-partiallyconstructed-objects/

Here the concern is not that we might see the pre-allocation state of an object. Rather, the concern here is that one thread might see an object while the constructor is still running on another thread.

Indeed, it is possible for the constructor and the finalizer to be running concurrently, which is super weird! Finalizers are hard to write correctly for this reason.

Put another way: the CLR guarantees you that its own invariants will be preserved. An invariant of the CLR is that newly allocated memory is observed to be zeroed out, so that invariant will be preserved.

But the CLR is not in the business of preserving your invariants! If you have a constructor which guarantees that field x is true if and only if y is non-null, then you are responsible for ensuring that this invariant is always observed to be true. If in some way this is observed by two threads, then one of those threads might observe the invariant being violated.

like image 134
Eric Lippert Avatar answered Sep 29 '22 10:09

Eric Lippert