
What is the CLR implementation behind raising/generating a null reference exception?

Tags: memory-management, c#, .net, nullreferenceexception, clr

We all run into this exception, one of the most common, sooner or later in day-to-day development. My question is NOT about WHY it occurs (I am aware it is raised when we try to access a member through a reference variable that is actually null) but about HOW the NullReferenceException is generated by the CLR.

Sometimes I wonder about the mechanism by which the CLR identifies a null reference (perhaps null is a reserved region of memory) and then raises the exception. How does the CLR detect this condition and raise this particular exception? Does the OS play any role in it?

I would like to share one of the most interesting claims about it:

null is actually a permanently reserved memory region known to the CLR, and all kinds of access to it are prohibited. Thus, when a reference to that region is dereferenced, the OS by default generates an access-denied kind of exception, which the CLR interprets as a NullReferenceException.

I haven't found any articles or posts supporting the above statement, so I find it hard to believe. Maybe I haven't dug deep enough, or there are other reasons. I expect Stack Overflow is one of the most appropriate platforms for getting the best answer.


asked Jun 28 '12 06:06

Sumeet

2 Answers

It doesn't have to work this way (there could be explicit null checks), but in practice it works by trapping access violations.

A .NET object is turned into a native object: its fields become a block of memory laid out in a particular manner, its methods are jitted into native machine code, and a v-table or other virtual method dispatch mechanism is created.

1. Accessing a field, then, means finding the address of the object, adding on the offset of the member, and reading or writing the piece of memory referred to.
2. Calling a virtual method means finding the address of the object, finding its method table (set offset within the object), finding the method's address (set offset within the table) and calling the method at that address with the address of the object being passed (the this pointer).
3. Calling a non-virtual method means calling the method with the address of the object passed (the this pointer).
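
The three kinds of access above can be sketched with a toy model. This is only an illustration, not how the CLR is implemented: "memory" is an int array, an "address" is an index into it, and the layout (method table pointer in slot 0, one field in slot 1) and all names are hypothetical.

```csharp
using System;

// Toy model only: an int[] stands in for memory, an index stands in for an address.
public static class ToyObjectModel
{
    // Hypothetical object layout: slot 0 = address of the method table, slot 1 = the only field.
    const int MethodTableOffset = 0;
    const int ValueFieldOffset = 1;

    // "Jitted code": each entry receives the memory and the this-address.
    static readonly Func<int[], int, int>[] Code =
    {
        (mem, self) => mem[self + ValueFieldOffset] * 2, // code #0: reads a field through this
        (mem, self) => 42,                               // code #1: never touches this
    };

    // Builds a memory image with a method table at address 10 and an object at address 20.
    public static int[] BuildMemory()
    {
        var mem = new int[32];
        mem[10] = 0;                        // method table, virtual slot 0 -> code #0
        mem[20 + MethodTableOffset] = 10;   // object header points at the method table
        mem[20 + ValueFieldOffset] = 7;     // the object's field
        return mem;
    }

    // Case 1: field access = object address + field offset.
    public static int ReadField(int[] mem, int obj) => mem[obj + ValueFieldOffset];

    // Case 2: virtual call = object -> method table -> slot -> code, passing this.
    public static int CallVirtual(int[] mem, int obj, int slot)
    {
        int table = mem[obj + MethodTableOffset];
        int codeIndex = mem[table + slot];
        return Code[codeIndex](mem, obj);
    }

    // Case 3: non-virtual call = jump straight to known code, passing this.
    public static int CallNonVirtual(int[] mem, int obj) => Code[1](mem, obj);
}
```

With the object at address 20, ReadField gives 7 and CallVirtual(mem, 20, 0) gives 14. CallNonVirtual(mem, 0) still returns 42 because that code never looks at the address it was given, whereas ReadField(mem, 0) quietly reads whatever happens to sit at index 1, since this toy memory has no protected region.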

Clearly, if there is no actual object at the address in question, cases 1 and 2 will go wrong in some way, while case 3 will work (but could in turn lead to case 1 or 2). There are two main ways this can go wrong:

1. It could access an arbitrary bit of memory that is not really an object of our type, leading to all sorts of exciting and really hard to trace bugs (.NET code generally won't result in anything that causes this scenario).
2. It could access an arbitrary bit of memory that is protected, leading to an access violation.

You may know about the second case from C, C++ or ASM coding. If not, you'll probably still have seen a program crash and, with its dying breath, talk about an access violation at some address. If so, you may have noticed that while the address given could be just about anything, it'll most often be either 0x00000000 or something very low like 0x00000020. Those were caused by code trying to dereference a null pointer, whether to access a field or to call a virtual method (which is essentially accessing a field and then calling depending on what you get).
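
To see why the faulting address is usually small but not always zero, here is a minimal sketch of the arithmetic. The Sample struct and FieldAddress helper are hypothetical, and the offsets come from the marshalled layout of a sequential struct, which is only a stand-in for a real managed object's layout (which also includes a header).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type, laid out sequentially so the offsets are predictable.
[StructLayout(LayoutKind.Sequential)]
public struct Sample
{
    public int First;   // offset 0
    public int Second;  // offset 4
    public long Third;  // offset 8
}

public static class FaultAddressSketch
{
    // Address that would be touched when reading `fieldName`
    // through a pointer to a Sample located at `baseAddress`.
    public static long FieldAddress(long baseAddress, string fieldName)
    {
        long offset = Marshal.OffsetOf<Sample>(fieldName).ToInt64();
        return baseAddress + offset;
    }
}
```

FieldAddress(0, "Third") is 8: a null base plus a field offset lands on a low, non-zero address, which matches the pattern of crash reports mentioning addresses such as 0x00000020.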

Now, since the first 64k of memory is always protected, dereferencing a null pointer will always result in the second case (access violation) rather than the first case (arbitrary memory being misused and resulting in bizarre "fandango on the core" bugs).

This is all exactly the same with .NET (or rather, with the jitted code produced by it), but if (A) the access violation happened at an address lower than 0x00010000 and (B) the violation is found to have happened in code that was jitted, then it is turned into a NullReferenceException; otherwise it is turned into an AccessViolationException.

We can simulate this with code that doesn't dereference a null reference, but which does access protected memory (we'll only read, so if we should happen to accidentally hit memory that isn't protected, the result won't be too weird!):
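
The two conditions (A) and (B) can be written down as a tiny decision function. This is only a model of the rule as described in this answer, not the CLR's actual code; the names Classify and NullPageLimit are made up for illustration.

```csharp
using System;

public static class FaultClassifierSketch
{
    // Assumption taken from the text: faults below this address are treated as null dereferences.
    public const ulong NullPageLimit = 0x00010000;

    // Returns the name of the exception the rule above would pick for a fault.
    public static string Classify(ulong faultAddress, bool inJittedCode)
    {
        bool looksLikeNull = faultAddress < NullPageLimit;   // condition (A)
        return (looksLikeNull && inJittedCode)               // condition (B)
            ? nameof(NullReferenceException)
            : nameof(AccessViolationException);
    }
}
```

Under this model, a fault at address 8 in jitted code is reported as a NullReferenceException, while the same fault at 0x00010000 or above, or outside jitted code, becomes an AccessViolationException.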

The following code will raise an AccessViolationException:

unsafe
{
  int read = *((int*)long.MaxValue - 8);
}

The following code will raise a NullReferenceException:

unsafe
{
  int read = *((int*)8);
}

Neither snippet dereferences an actual null reference. Both cause access violations, but the CLR assumes the latter was probably caused by a null reference (in fairness, by far the most likely scenario) and raises a NullReferenceException.

So, we can see how field access and callvirt can cause this.
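
A safe way to observe both paths, without unsafe code, is to trigger them through an ordinary null reference and catch the result. The Shape type and the helper names below are hypothetical.

```csharp
using System;

public class Shape
{
    public int Sides;
    public virtual string Name() => "shape";
}

public static class NullAccessDemo
{
    // Field access: the read goes through the object's address plus the field's offset.
    public static int ReadSides(Shape s) => s.Sides;

    // Virtual call: the method is looked up through the object before it is called.
    public static string ReadName(Shape s) => s.Name();

    // Runs the action and reports whether it ended in a NullReferenceException.
    public static bool ThrowsNullReference(Action action)
    {
        try { action(); return false; }
        catch (NullReferenceException) { return true; }
    }
}
```

Calling ThrowsNullReference(() => NullAccessDemo.ReadSides(null)) and the same with ReadName both return true, while passing a real Shape returns false.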

It's worth noting that, because of a decision not to allow C# to call methods on null references even when it would be safe to do so, callvirt is used as the IL for the majority of calls in C#; the only exceptions are static methods and cases where it can be shown at compile time that the call is not on a null reference. (Edit: There are a few other cases where the compiler can see that a callvirt can be replaced by a call, even when the method actually is virtual [if the compiler can tell which override would be hit], and newer compilers do this slightly more often, though they still use callvirt more often than you might imagine.)

An interesting case is where optimisation means that a method called with callvirt could be inlined, but where the reference isn't known at compile time to be guaranteed non-null. In such a case a field access may be added before the place where the "call" (that isn't really a call) happens, precisely to trigger the NullReferenceException at the start, rather than in the middle, of the method. This means the optimisation does not change the observed behaviour.
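
The effect of that decision is easy to observe. In this sketch (Greeter and CallvirtDemo are hypothetical names), a non-virtual instance method that never touches this still fails on a null reference, while a static method taking the same null argument runs normally.

```csharp
using System;

public class Greeter
{
    // Non-virtual and never touches `this`, yet calling it on null still throws.
    public string Hello() => "hello";

    // Static: there is no `this`, so no null check is involved in the call.
    public static bool IsMissing(Greeter g) => g == null;
}

public static class CallvirtDemo
{
    // Returns the greeting, or the exception's name if the call failed on null.
    public static string TryHello(Greeter g)
    {
        try { return g.Hello(); }
        catch (NullReferenceException) { return nameof(NullReferenceException); }
    }
}
```

CallvirtDemo.TryHello(null) yields "NullReferenceException", TryHello(new Greeter()) yields "hello", and Greeter.IsMissing(null) simply returns true.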
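
The observable guarantee can be checked from C# without knowing whether the JIT inlined anything. In this sketch (Counter and InlineCheckDemo are hypothetical), the method has a side effect before it first touches this; if the null check were delayed until the first field access, that side effect would be visible.

```csharp
using System;

public class Counter
{
    public static int Entered;  // counts how many times Bump's body started running
    private int _count;

    public int Bump()
    {
        Entered++;              // would run first if the null check were delayed
        return ++_count;        // first real use of `this`
    }
}

public static class InlineCheckDemo
{
    // True if calling Bump on `c` threw before any of Bump's body ran.
    public static bool BumpThrowsBeforeBody(Counter c)
    {
        int before = Counter.Entered;
        try { c.Bump(); }
        catch (NullReferenceException) { return Counter.Entered == before; }
        return false;
    }
}
```

BumpThrowsBeforeBody(null) returns true: the exception surfaces at the call site and Entered is left unchanged, whether or not Bump was inlined.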

answered Nov 14 '22 15:11

Jon Hanna

The MS implementation, IIRC, does this via an access violation. Null is essentially a zero reference, and basically: they deliberately reserve that address space and leave this page unmapped. The memory access violation is raised at the CPU/OS level automatically (i.e. without needing extra code to do a null check), and the CLI then reports this as a null-reference exception.

Interestingly, because memory is handled in pages, you can actually simulate (if you try hard enough) a null-reference exception on a non-zero but low value, for the same reasons.

Edit: Eric Lippert discusses this on this related question/answer: https://stackoverflow.com/a/8681563

answered Nov 14 '22 13:11

Marc Gravell
