
How does the debugger get type information about an object initialized to null?

Tags: c#, .net, debugging

If an object is initialized to null, it is not possible to get the type information because the reference doesn't point to anything.

However, when I debug and I hover over a variable, it shows the type information. Only the static methods are shown, but still, it seems to know the type. Even in release builds.

Does the debugger use other information than just reflection of some sort to find out the datatype? How come it knows more than I? And if it knows this, why isn't it capable of showing the datatype in a NullReferenceException?

Abel asked Jan 12 '12 16:01


2 Answers

It seems like you're confusing the type of the reference with the type of the value that it points to. The type of the reference is embedded in the DLL metadata and is readily accessible to the debugger. There is also additional information stored in the associated PDB that the debugger leverages to provide a better experience. Hence, even for null references, a debugger can determine information like type and name.
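To make the distinction concrete, here is a minimal sketch that reads the declared type of a null field through reflection, which draws on the same assembly metadata. The `Fruit`, `Apple`, and `Basket` types are hypothetical names made up for illustration; a real debugger goes through debugging APIs rather than reflection, so this only shows that the information exists independently of any object.

```csharp
using System;
using System.Reflection;

public class Fruit { }
public class Apple : Fruit { }

public class Basket
{
    // Declared as Fruit; currently refers to no object.
    public Fruit Item = null;
}

public static class ReferenceTypeDemo
{
    // Type of the reference: read from metadata, no object needed.
    public static Type DeclaredType()
    {
        FieldInfo field = typeof(Basket).GetField("Item");
        return field.FieldType;
    }

    // Type of the value: only available when an object is actually there.
    public static Type RuntimeType(Basket basket)
    {
        return basket.Item?.GetType();
    }

    public static void Main()
    {
        var basket = new Basket();
        Console.WriteLine(DeclaredType());               // declared type of the field
        Console.WriteLine(RuntimeType(basket) == null);  // no object, so no runtime type
        basket.Item = new Apple();
        Console.WriteLine(RuntimeType(basket));          // runtime type of the stored object
    }
}
```

The declared type stays the same before and after the assignment; only the runtime type depends on what the field currently refers to.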

As for NullReferenceException: could it also tell you the type on which it was querying a field or method? Possibly. I'm not familiar with the internals of this part of the CLR, but there doesn't seem to be an inherent reason why it couldn't do so.

But I'm not sure the added cost to the CLR would be worth the benefit. I share the frustration about the lack of information for a null ref exception. But more than the type involved, I want names! I don't care that it was an IComparable; I wanted to know it was leftCustomer.

Names are something the CLR doesn't always have access to, as a good portion of them live in the PDB and not in metadata. Hence it can't provide them with great reliability (or speed).
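A small sketch of that gap, using only reflection: the method body exposes each local's slot index and type, but `LocalVariableInfo` has no name property, because local names are not part of the metadata. `LocalNamesDemo` and `Sample` are hypothetical names, and how many local slots actually appear depends on the compiler and its optimization settings.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class LocalNamesDemo
{
    // A method with a named local; the name "leftCustomer" exists in source
    // (and in the PDB, if one is produced), not in the assembly metadata.
    public static int Sample()
    {
        string leftCustomer = null;
        return leftCustomer == null ? 0 : leftCustomer.Length;
    }

    // Returns what metadata knows about Sample's locals: slot and type only.
    public static IList<LocalVariableInfo> Locals()
    {
        MethodInfo method = typeof(LocalNamesDemo).GetMethod(nameof(Sample));
        return method.GetMethodBody().LocalVariables;
    }

    public static void Main()
    {
        foreach (LocalVariableInfo local in Locals())
        {
            Console.WriteLine($"slot {local.LocalIndex}: {local.LocalType}");
        }
    }
}
```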

JaredPar answered Oct 04 '22 17:10


Jared's answer is of course correct. Just to add a little to it:

when I debug and I hover over a variable, it shows the type information

Right. You have a bowl. The bowl is labelled "FRUIT". The bowl is empty. What is the type of the fruit in the bowl? You cannot say, because there isn't any fruit in the bowl. But that does not mean that you know nothing about the bowl. You know that the bowl can contain any fruit.

When you hover over a variable then the debugger can tell you about the variable itself or about its contents.

Does the debugger use other information than just reflection of some sort to find out the datatype?

Absolutely. The debugger needs to know not just the type of the thing referred to by this reference, but also what restrictions are placed on what can be stored in this variable. All the information about the restrictions placed on particular storage locations is known to the runtime, and the runtime can pass that information to the debugger.

How come it knows more than I?

I reject the premise of the question. The debugger is running on your behalf; it cannot do anything that you cannot do yourself. If you don't know what the type restriction on a particular variable is, it's not because you lack the ability to find out. You just haven't looked yet.
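One way to look, sketched under the assumption that you only want the variable's declared type: let generic type inference capture it. The helper `DeclaredTypeOf` is a hypothetical name, not a framework method; the compiler infers `T` from the variable's declared type, so the value is never inspected and null is fine.

```csharp
using System;

public class Fruit { }

public static class StaticTypeDemo
{
    // T comes from the compile-time type of the argument expression,
    // so this works even when the argument's value is null.
    public static Type DeclaredTypeOf<T>(T value) => typeof(T);

    public static void Main()
    {
        Fruit f = null;
        Console.WriteLine(DeclaredTypeOf(f));  // the type restriction on the variable f
    }
}
```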

if it knows this, why isn't it capable of showing the datatype in a NullReferenceException?

Think about what is actually happening when you dereference null. Suppose for example you do this:

Fruit f = null;
string s = f.ToString();

ToString might be overridden in Fruit. What code must the jitter generate? Let's suppose that local variable f is stored in a stack location. The jitter says:

  • Copy the contents of the memory address at the stack pointer offset associated with f to register 1
  • The virtual function table is going to be, let's say, eight bytes from the top of that pointer, and ToString is going to be, let's say, four bytes from the top of that table. (I am just making these numbers up; I don't know what the real offsets are off the top of my head.) So, start by adding eight to the current contents of register 1.
  • Now dereference the current contents of register 1 to get the address of the vtable into register 2
  • Now add four bytes to register 2
  • Now we have a pointer to the ToString method...

But hold on a minute, let's follow that logic again. The first step puts zero into register 1, because f contains null. The second step adds eight to that. The third step dereferences pointer 0x00000008, and the virtual memory system issues an exception stating that an illegal memory page has just been touched. The CLR handles the exception, determines that the access fell within the first 64 KB of memory, and guesses that someone has just dereferenced a null pointer. It therefore creates a null reference exception and throws it.

The virtual memory system surely does not know that the reason it dereferenced pointer 0x00000008 was because someone was trying to call f.ToString(). That information is lost in the past; the memory manager's job is to tell you when you touched something you don't have any right to touch; why you tried to touch memory you don't own is not its job to figure out.
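The observable result can be sketched as follows: the exception that comes out of the failed call is a plain `NullReferenceException` with a generic message, with nothing recorded about `f` or `Fruit`. `NullDereferenceDemo` and `Capture` are hypothetical names, and the exact message text depends on the runtime and its language settings.

```csharp
using System;

public class Fruit { }

public static class NullDereferenceDemo
{
    // Dereferences a null Fruit and returns the exception the runtime threw.
    public static NullReferenceException Capture()
    {
        Fruit f = null;
        try
        {
            f.ToString();  // virtual call through a null reference
        }
        catch (NullReferenceException ex)
        {
            return ex;
        }
        return null;  // not reached: the call above always throws
    }

    public static void Main()
    {
        NullReferenceException ex = Capture();
        Console.WriteLine(ex.GetType());
        Console.WriteLine(ex.Message);  // generic text; no variable or type name
    }
}
```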

The CLR could maintain a separate side data structure such that every time you accessed memory, it made a note of why you were attempting to do so. That way, the exception could have more information in it, describing what you were doing when the exception happened. Imagine the cost of maintaining such a data structure for every access to memory! Managed code could easily be ten times slower than it is today, and that cost is borne just as heavily by correct code as by broken code. And for what? To tell you what you can easily figure out yourself: which variable containing null you dereferenced.

The feature isn't worth the cost, so the CLR does not do it. There's no technical reason why it could not; it's just not practical.
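Code that does want the name in the exception can pay for it only where it matters, by checking explicitly. This is a sketch of a conventional guard clause, not something either answer prescribes; `GuardDemo`, `CompareNames`, and the parameter names are hypothetical, with `leftCustomer` borrowed from the earlier answer.

```csharp
using System;

public static class GuardDemo
{
    // Explicit checks attach the parameter name to the exception,
    // at the cost of one comparison per argument.
    public static int CompareNames(string leftCustomer, string rightCustomer)
    {
        if (leftCustomer == null) throw new ArgumentNullException(nameof(leftCustomer));
        if (rightCustomer == null) throw new ArgumentNullException(nameof(rightCustomer));
        return string.CompareOrdinal(leftCustomer, rightCustomer);
    }

    // Calls CompareNames with a null first argument and reports which name was recorded.
    public static string NameOfNullArgument()
    {
        try
        {
            CompareNames(null, "b");
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName;
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(NameOfNullArgument());
    }
}
```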

Eric Lippert answered Oct 04 '22 16:10