
How are C# object references represented in memory / at runtime (in the CLR)?

I'm curious to know how C# object references are represented in memory at runtime (in the .NET CLR). Some questions that come to mind are:

  1. How much memory does an object reference occupy? Does it differ when defined in the scope of a class vs. the scope of a method? Does where it lives differ based on this scope (stack vs. heap)?

  2. What is the actual data maintained within an object reference? Is it simply a memory address that points to the object it refers to or is there more to it? Does this differ based on whether it is defined within the scope of a class or method?

  3. Same questions as above, but this time for a reference to a reference, as when an object reference is passed to a method by reference. How do the answers to 1 and 2 change?

chopperdave asked Feb 29 '12 00:02



2 Answers

".NET Heaps and Stacks": this is a thorough treatment of how the stack and heap work.

In general reference-speak, C# and many other heap-using OOP languages use handles, not pointers, for references in this context (C# is also capable of using pointers!). Pointer analogies work for some general concepts, but this conceptual model breaks down for questions like this. See Eric Lippert's excellent post on this topic, "Handles are Not Addresses".

It is not appropriate to say a handle is the size of a pointer (although it may coincidentally be the same). Handles are aliases for objects; they are not required to be a formal address of an object.

In this case the CLR happens to use real addresses for the handles. From the above link:

...the CLR actually does implement managed object references as addresses to objects owned by the garbage collector, but that is an implementation detail.

So yes, a handle is probably 4 bytes on a 32-bit architecture and 8 bytes on a 64-bit architecture, but this is not guaranteed, and it is not directly because of pointers. It is worth noting that, depending on the compiler implementation and the address ranges used, some types of pointers can differ in size.
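As a rough check on the sizes mentioned above, the following sketch prints the native pointer size of the current process. It assumes C# 9 or later (top-level statements) and leans on the implementation detail quoted above, that the CLR currently implements references as addresses. `IntPtr.Size` reports the pointer size, not a guaranteed reference size, and the variable names are only illustrative.

```csharp
using System;

// IntPtr.Size is the size in bytes of a native pointer in this process.
// Assumption: the CLR implements object references as addresses, so a
// reference currently occupies the same number of bytes. This is an
// implementation detail, not something the language guarantees.
int pointerSize = IntPtr.Size;
bool is64Bit = Environment.Is64BitProcess;

Console.WriteLine($"64-bit process: {is64Bit}");
Console.WriteLine($"Native pointer size: {pointerSize} bytes");
```

Running the same program as a 32-bit and as a 64-bit process should show the two sizes discussed above.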

With all of this context you can probably model this with a pointer analogy, but it's important to realize that handles are not required to be addresses. The CLR could change this in the future, and consumers of the CLR should not be able to tell the difference.

To drive this subtle point home:

This is a C# Pointer:

int* myVariable;

This is a C# Handle:

object myVariable;

They are not the same.

You can do things like math on pointers that you shouldn't do with handles. If your handle happens to be implemented like a pointer and you use it as if it were a pointer, you are misusing the handle in ways that could get you in trouble later on.

Joshua Enfield answered Nov 12 '22 12:11



This answer is most easily understood if you understand C/C++ pointers. A pointer is simply the memory address of some data.

  1. An object reference should be the size of a pointer, which is normally 4 bytes on a 32-bit CPU and 8 bytes on a 64-bit CPU. It is the same regardless of where it is defined. Where it lives does depend on where it is defined. If it is a field of a class, it resides on the heap in the object it is part of. If it is a static field, it is located in a special section of the heap that is not subject to garbage collection. If it is a local variable, it normally lives on the stack.

  2. An object reference is simply a pointer, which can be visualized as an int or long containing the address of the object in memory. It is the same regardless of where it is defined.

  3. This is implemented as a pointer to a pointer. The data is the same: just a memory address. However, there is no object at that memory address. Instead, there is another memory address, which is the original reference to the object. This is what allows a reference parameter to be modified. Normally, a parameter disappears when its method completes. Because the original reference is not itself a parameter, changes made to it through the reference parameter remain after the method returns. The reference to the reference disappears, but the original reference does not. This is the purpose of passing parameters by reference.
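The behaviour described in point 3 can be sketched as follows. This is a minimal illustration, assuming C# 9 or later (top-level statements). The method names `ReplaceByValue` and `ReplaceByRef` are hypothetical, and the sketch shows only the observable semantics, not how the CLR lays anything out in memory.

```csharp
using System;
using System.Text;

// Receives a copy of the caller's reference. Reassigning the parameter
// changes only that copy; the caller's variable is untouched.
static void ReplaceByValue(StringBuilder sb)
{
    sb = new StringBuilder("replaced");
}

// Receives a reference to the caller's variable itself (the
// "reference to a reference"), so the reassignment is visible to the caller.
static void ReplaceByRef(ref StringBuilder sb)
{
    sb = new StringBuilder("replaced");
}

var text = new StringBuilder("original");

ReplaceByValue(text);
string afterByValue = text.ToString();   // caller still sees "original"

ReplaceByRef(ref text);
string afterByRef = text.ToString();     // caller now sees "replaced"

Console.WriteLine(afterByValue);
Console.WriteLine(afterByRef);
```

In both calls the parameter itself goes away when the method returns; only in the `ref` case does the caller's own variable end up pointing at a different object.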

One thing you should know: value types are stored in place (there is no memory address; instead, the value is stored directly where the memory address would be. See #1). When a value type is passed to a method, a copy is made and that copy is used in the method. When it is passed by reference, a memory address is passed that locates the value in memory, allowing it to be changed.
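The copy-versus-address behaviour for value types can be seen in a small sketch like the one below. It assumes C# 9 or later (top-level statements) and uses a value tuple as a convenient built-in struct. The method names `MoveByValue` and `MoveByRef` are hypothetical.

```csharp
using System;

// A value tuple is a struct, so the argument is copied into the parameter.
static void MoveByValue((int X, int Y) p)
{
    p.X = 100;   // modifies the method's own copy only
}

// With ref, the method works on the caller's storage directly.
static void MoveByRef(ref (int X, int Y) p)
{
    p.X = 100;   // modifies the caller's value
}

(int X, int Y) point = (1, 2);

MoveByValue(point);
int afterByValue = point.X;   // unchanged: 1

MoveByRef(ref point);
int afterByRef = point.X;     // changed: 100

Console.WriteLine(afterByValue);
Console.WriteLine(afterByRef);
```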

Edit: As dlev pointed out, these answers are not hard-and-fast rules, since nothing says this is how it must be. .NET is free to implement these things however it wants. This is the most likely implementation, though, as it is how Intel CPUs work internally, so any other method would likely be inefficient.

Hope I didn't confuse you too much, but feel free to ask if you need clarification.

Kendall Frey answered Nov 12 '22 11:11
