Read Introduction in C# - how to protect against it?

An article in MSDN Magazine discusses the notion of Read Introduction and gives a code sample which can be broken by it.

    public class ReadIntro {
        private Object _obj = new Object();

        void PrintObj() {
            Object obj = _obj;
            if (obj != null) {
                Console.WriteLine(obj.ToString()); // May throw a NullReferenceException
            }
        }

        void Uninitialize() {
            _obj = null;
        }
    }

Notice this "May throw a NullReferenceException" comment - I never knew this was possible.

So my question is: how can I protect against read introduction?

I would also be really grateful for an explanation of exactly when the compiler decides to introduce reads, because the article doesn't cover it.

asked Feb 10 '13 16:02

Gebb

1 Answer

Let me try to clarify this complicated question by breaking it down.

What is "read introduction"?

"Read introduction" is an optimization whereby the code:

    public static Foo foo; // I can be changed on another thread!

    void DoBar() {
        Foo fooLocal = foo;
        if (fooLocal != null) fooLocal.Bar();
    }

is optimized by eliminating the local variable. The compiler can reason that if there is only one thread then foo and fooLocal are the same thing. The compiler is explicitly permitted to make any optimization that would be invisible on a single thread, even if it becomes visible in a multithreaded scenario. The compiler is therefore permitted to rewrite this as:

    void DoBar() {
        if (foo != null) foo.Bar();
    }

And now there is a race condition. If foo turns from non-null to null after the check then it is possible that foo is read a second time, and the second time it could be null, which would then crash. From the perspective of the person diagnosing the crash dump this would be completely mysterious.
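
To make that interleaving concrete, here is a minimal single-threaded sketch that simulates the two-read version by hand. It does not provoke the JIT into introducing a read; the betweenReads hook and the TwoReadDemo name are hypothetical, standing in for another thread that happens to run between the null check and the call.

```csharp
using System;

public class Foo
{
    public void Bar() { }
}

public static class TwoReadDemo
{
    // Shared field, as in the example above.
    public static Foo foo = new Foo();

    // Hand-written equivalent of the transformed method: foo is read twice.
    // betweenReads is a hypothetical hook that plays the role of another
    // thread running between the null check and the call.
    // Returns true if Bar() ran, false if it did not.
    public static bool DoBarWithTwoReads(Action betweenReads)
    {
        if (foo != null)              // first read of the field
        {
            betweenReads();           // the "other thread" runs here
            try
            {
                foo.Bar();            // second read of the field
                return true;
            }
            catch (NullReferenceException)
            {
                return false;         // the second read saw null
            }
        }
        return false;
    }
}
```

Calling DoBarWithTwoReads(() => { TwoReadDemo.foo = null; }) takes the failing path even though the null check passed, which is exactly the crash that would look mysterious in a dump.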

Can this actually happen?

As the article you linked to called out:

"Note that you won’t be able to reproduce the NullReferenceException using this code sample in the .NET Framework 4.5 on x86-x64. Read introduction is very difficult to reproduce in the .NET Framework 4.5, but it does nevertheless occur in certain special circumstances."

x86/x64 chips have a "strong" memory model, and the JIT compilers are not aggressive in this area; they will not do this optimization.

If you happen to be running your code on a weak memory model processor, like an ARM chip, then all bets are off.

When you say "the compiler" which compiler do you mean?

I mean the JIT compiler. The C# compiler never introduces reads in this manner. (It is permitted to, but in practice it never does.)

Isn't it a bad practice to be sharing memory between threads without memory barriers?

Yes. Something should be done here to introduce a memory barrier, because the value of foo could already be a stale cached value in the processor cache. My preference for introducing a memory barrier is to use a lock. You could also make the field volatile, use Thread.VolatileRead, or use one of the Interlocked methods. All of those introduce a memory barrier. (volatile introduces only a "half fence", FYI.)

Just because there's a memory barrier does not necessarily mean that read introduction optimizations are not performed. However, the jitter is far less aggressive about pursuing optimizations that affect code that contains a memory barrier.
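
As a rough sketch of the explicit-read option, the question's class might be rewritten along these lines. ReadIntroGuarded is a hypothetical name, and Volatile.Read / Volatile.Write (in System.Threading, available from .NET Framework 4.5) are used here in place of the older Thread.VolatileRead. Per the caveat above, treat this as making a re-read far less likely, not as a guarantee stated by this answer.

```csharp
using System;
using System.Threading;

public class ReadIntroGuarded
{
    private object _obj = new object();

    // Returns the text that was printed, or null if the field was already cleared.
    public string PrintObj()
    {
        // Volatile.Read performs the field read with acquire semantics.
        // Assumption (not a guarantee made in this answer): the JIT keeps
        // this as the only read of _obj and does not re-fetch it below.
        object obj = Volatile.Read(ref _obj);
        if (obj == null)
        {
            return null;
        }
        string text = obj.ToString();
        Console.WriteLine(text);
        return text;
    }

    public void Uninitialize()
    {
        // Matching release-style write on the other side.
        Volatile.Write(ref _obj, null);
    }
}
```

Declaring the field as private volatile object _obj; would be the field-level alternative mentioned above; the explicit calls just make the single read visible at the call site.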

Are there other dangers to this pattern?

Sure! Let's suppose there are no read introductions. You still have a race condition. What if another thread sets foo to null after the check, and also modifies global state that Bar is going to consume? Now you have two threads, one of which believes that foo is not null and the global state is OK for a call to Bar, and another thread which believes the opposite, and you're running Bar. This is a recipe for disaster.

So what's the best practice here?

First, do not share memory across threads. This whole idea that there are two threads of control inside the main line of your program is just crazy to begin with. It never should have been a thing in the first place. Use threads as lightweight processes; give them an independent task to perform that does not interact with the memory of the main line of the program at all, and just use them to farm out computationally intensive work.
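
A minimal sketch of that "lightweight process" style, assuming Task.Run is available (.NET Framework 4.5 or later). IsolatedWork, SumOfSquares and RunOnWorker are hypothetical names; the point is only that the worker gets its own copy of the input and hands back a result, with no shared fields in between.

```csharp
using System;
using System.Threading.Tasks;

public static class IsolatedWork
{
    // Pure function: reads only its argument and touches no shared state.
    public static long SumOfSquares(int[] input)
    {
        long total = 0;
        foreach (int x in input)
        {
            total += (long)x * x;
        }
        return total;
    }

    public static long RunOnWorker(int[] data)
    {
        // Give the worker a private copy so the caller's array is not shared.
        int[] copy = (int[])data.Clone();
        Task<long> work = Task.Run(() => SumOfSquares(copy));

        // The returned value is the only thing that crosses threads.
        // Blocking on Result keeps the sketch short; real code might await.
        return work.Result;
    }
}
```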

Second, if you are going to share memory across threads then use locks to serialize access to that memory. Locks are cheap if they are not contended, and if you have contention, then fix that problem. Low-lock and no-lock solutions are notoriously difficult to get right.
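
One way this could look for the check-then-call pattern, as a sketch under stated assumptions: Widget and WidgetHolder are hypothetical stand-ins for Foo and its owner, and every access to the shared field goes through the same private lock object.

```csharp
using System;

public class Widget
{
    public int BarCalls { get; private set; }

    public void Bar()
    {
        BarCalls++;
    }
}

public class WidgetHolder
{
    private readonly object _gate = new object();
    private Widget _widget = new Widget();

    // The null check and the call happen under the same lock,
    // so Clear() cannot run in between them.
    public bool TryDoBar()
    {
        lock (_gate)
        {
            if (_widget == null)
            {
                return false;
            }
            _widget.Bar();
            return true;
        }
    }

    public void Clear()
    {
        lock (_gate)
        {
            _widget = null;
        }
    }
}
```

Because the writer takes the same lock, any other state it changes alongside the field can be updated inside that lock too, which also addresses the "other dangers" scenario described earlier.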

Third, if you are going to share memory across threads then every single method you call that involves that shared memory must either be robust in the face of race conditions, or the races must be eliminated. That is a heavy burden to bear, and that is why you shouldn't go there in the first place.

My point is: read introductions are scary but frankly they are the least of your worries if you are writing code that blithely shares memory across threads. There are a thousand and one other things to worry about first.

answered Dec 18 '22 02:12

Eric Lippert
