I understand what boxing is. A value type is boxed to an object/reference type and is then stored on the managed heap as an object. But I can't get my head around unboxing.
Unboxing converts your object/reference type back to the value type:
int i = 123; // A value type
object box = i; // Boxing
int j = (int)box; // Unboxing
Alright. But if I try to unbox it into a different value type, for example long in the example above, it throws an InvalidCastException:
long d = (long)box;
This leaves me with the idea that maybe the runtime implicitly knows the actual type of the value boxed inside the "box" object. If I am right, I wonder where this type information is stored.
EDIT:
Since int is implicitly convertible to long, this is what is confusing me.
int i = 123;
long lng = i;
is perfectly fine because it has no boxing/unboxing involved.
Boxing is used to store value types in the garbage-collected heap. Boxing is an implicit conversion of a value type to the type object or to any interface type implemented by this value type.
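Both forms of boxing mentioned above can be sketched in a few lines. This is only an illustration: the class and method names (BoxingToInterfaceDemo, BothAreBoxedInt32) are made up for the example.

```csharp
using System;

static class BoxingToInterfaceDemo
{
    // Hypothetical helper: true when both references point at boxed System.Int32 objects.
    public static bool BothAreBoxedInt32(object a, IComparable b)
    {
        return a.GetType() == typeof(int) && b.GetType() == typeof(int);
    }

    static void Main()
    {
        int i = 123;
        object asObject = i;          // implicit boxing to object
        IComparable asInterface = i;  // implicit boxing to an interface that int implements

        Console.WriteLine(BothAreBoxedInt32(asObject, asInterface)); // True
        Console.WriteLine(asInterface.CompareTo(100) > 0);           // True
    }
}
```

In both assignments no cast is written; the compiler inserts the boxing for you.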
Boxed values are data structures that are minimal wrappers around primitive types. Boxed values are typically stored as pointers to objects on the heap. Thus, boxed values use more memory and take at minimum two memory lookups to access: one to get the pointer, and another to follow that pointer to the primitive.
Boxing and unboxing enable a unified view of the type system wherein a value of any type can ultimately be treated as an object. With boxing and unboxing one can bridge value types and reference types, because any value of a value type can be converted to and from type object.
Boxing and unboxing are the processes that enable value types (e.g., integers) to be treated as reference types (objects). The value is "boxed" inside an Object and subsequently "unboxed" back to a value type. It is this process that allowed you to call the ToString() method on the integer in Example 6-4.
When a value is boxed it gets an object header, the same kind that any type deriving from System.Object has. The value follows that header. The header contains two fields. One is the "syncblk", which has various uses that are beyond the scope of the question. The second field describes the type of the object.
That's the one you are asking about. It has various names in the literature, most commonly "type handle" or "method table pointer". The latter is the most accurate description: it is a pointer to the info the CLR keeps track of whenever it loads a type. Lots of framework features depend on it, Object.GetType() of course. Any cast in your code, as well as the is and as operators, uses it. These casts are safe so you can't turn a Dog into a Cat; the type handle provides this guarantee. The method table pointer for your boxed int points to the method table for System.Int32.
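You can observe the effect of that type information from C# without looking at the header itself. A minimal sketch, where BoxedTypeInfoDemo and DescribeBox are names invented for this example:

```csharp
using System;

static class BoxedTypeInfoDemo
{
    // Hypothetical helper: reports which type the runtime sees inside the box.
    public static string DescribeBox(object box)
    {
        if (box is int) return "int";
        if (box is long) return "long";
        return box.GetType().FullName;
    }

    static void Main()
    {
        object box = 123;                           // boxed System.Int32

        Console.WriteLine(box.GetType().FullName);  // System.Int32
        Console.WriteLine(box is int);              // True
        Console.WriteLine(box is long);             // False

        long? viaAs = box as long?;                 // 'as' needs a nullable type for value types
        Console.WriteLine(viaAs.HasValue);          // False

        Console.WriteLine(DescribeBox(box));        // int
    }
}
```

GetType(), is and as all answer from the type recorded with the boxed object, not from the static type of the variable, which is just object here.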
Boxing was very common in .NET 1.x, before generics became available. All of the common collection types stored object instead of T. So putting an element in the collection required (implicit) boxing, getting it out again required explicit unboxing with a cast.
To make this efficient, it was pretty important that the jitter didn't need to consider the possibility that a conversion would be required, because that requires a lot more work. So the C# language included the rule that unboxing to another type is illegal. All that's needed now is a check on the type handle to ensure it is the expected type. The jitter directly compares the method table pointer to the one for System.Int32 in your case. And the value embedded in the object can be copied directly without any conversion concerns. Pretty fast, as fast as it can possibly be: this can all be done with inline machine code without any CLR call.
This rule is specific to C#; VB.NET doesn't have it. It is a typical trade-off between those two languages: C#'s focus is on speed, VB.NET's on convenience. Converting to another type when unboxing isn't otherwise a problem, since all simple value types implement IConvertible. You write it explicitly in your code, using the Convert helper class:
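The exact-type rule is easy to see in a small sketch. The helper names below (TryUnboxAsLong, UnboxIntThenWiden) are made up for illustration; the second one shows the usual workaround of unboxing to the real type first and converting afterwards.

```csharp
using System;

static class UnboxExactTypeDemo
{
    // Hypothetical helper: attempts a direct unbox to long and reports whether it worked.
    public static bool TryUnboxAsLong(object box, out long value)
    {
        try
        {
            value = (long)box;   // succeeds only if the box really holds a System.Int64
            return true;
        }
        catch (InvalidCastException)
        {
            value = 0;
            return false;
        }
    }

    // Hypothetical helper: unbox to the exact type first, then let the int widen to long.
    public static long UnboxIntThenWiden(object box)
    {
        return (long)(int)box;
    }

    static void Main()
    {
        object box = 123;  // boxed System.Int32

        Console.WriteLine(TryUnboxAsLong(box, out long direct)); // False
        Console.WriteLine(UnboxIntThenWiden(box));               // 123
    }
}
```

The inner (int) cast is the unboxing and must match the boxed type; the outer (long) cast is an ordinary numeric conversion on the unboxed value.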
int i = 123; // A value type
object box = i; // Boxing
long j = Convert.ToInt64(box); // Conversion + unboxing
Which is pretty similar to the code that the VB.NET compiler auto-generates.
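Because Convert.ToInt64(object) goes through IConvertible, the same call works no matter which simple type is in the box. A small sketch, with ToLong being a hypothetical wrapper added only for the example:

```csharp
using System;

static class ConvertBoxedDemo
{
    // Hypothetical wrapper: converts whatever simple value is boxed into a long.
    public static long ToLong(object box)
    {
        return Convert.ToInt64(box);
    }

    static void Main()
    {
        object boxedInt = 123;
        object boxedShort = (short)7;
        object boxedByte = (byte)255;

        Console.WriteLine(ToLong(boxedInt));   // 123
        Console.WriteLine(ToLong(boxedShort)); // 7
        Console.WriteLine(ToLong(boxedByte));  // 255
    }
}
```

This costs an interface call per conversion, which is the extra work the plain unboxing cast avoids.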
It's because the boxing instruction stores the value type's token in the resulting object (see MSDN). When you unbox the value from the object, its type (and size in memory) is therefore known, so you must cast the object to the original value type.
In your example you don't even need a cast to go from int to long, because that is an implicit conversion.
It is because boxing does not move the value from the stack to the heap. It creates a copy of it on the heap and stores a reference to that copy in a new slot on the stack, while the original value stays where it was. The boxed copy carries its type information with it. At the time of unboxing, the runtime compares the type recorded in the heap object with the type you are unboxing to, and throws if they do not match. So it is necessary to unbox to the same data type that you boxed.
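That boxing makes a copy, rather than moving the original, can be checked with a short sketch (BoxIsCopyDemo and BoxThenChangeOriginal are illustrative names only):

```csharp
using System;

static class BoxIsCopyDemo
{
    // Hypothetical helper: boxes a local, changes the local, then reads the box.
    public static int BoxThenChangeOriginal()
    {
        int i = 123;
        object box = i;   // a copy of i now lives on the heap
        i = 456;          // only the original local changes
        return (int)box;  // the boxed copy still holds the old value
    }

    static void Main()
    {
        Console.WriteLine(BoxThenChangeOriginal()); // 123
    }
}
```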
Every reference object has a bunch of metadata associated with it. This includes the exact type of the given object (which is why you can have type safety at all).
So while an int is stored by value, this information is actually missing (not that it matters), but once you box it, a new object is created with all the necessary metadata. This also means that while an int is just 4 bytes, a boxed int is much more than that: you've got a reference now (4-8 bytes), the value itself (4) and the metadata (which includes the specific type handle). This is very different from e.g. C++, which allows you to cast any pointer to a pointer of any type (leaving you to deal with the errors when you cast it wrong).
Again, all the by-reference objects have this metadata. This is quite an important part of the cost of reference types, but it is also the means by which you can be sure of the type safety. This also nicely shows how expensive an ArrayList of int can really be, and why int[] or List<int> is much more efficient - even ignoring the costs of allocating (and more importantly collecting) heap objects and the boxing and unboxing itself, the 4 byte int could suddenly be 20 bytes, just because you're storing a reference to it :)
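The difference shows up in the code you have to write as well. A rough sketch, with made-up helper names (SumArrayList, SumList); it does not measure memory, it only shows where boxing and unboxing happen:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class CollectionBoxingDemo
{
    // Hypothetical helper: every element is a boxed int and must be unboxed with a cast.
    public static int SumArrayList(ArrayList list)
    {
        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;  // unboxing, with the exact-type check
        }
        return sum;
    }

    // Hypothetical helper: elements are stored as plain ints, no boxing involved.
    public static int SumList(List<int> list)
    {
        int sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    static void Main()
    {
        var boxed = new ArrayList { 1, 2, 3 };  // each Add boxes the int
        var plain = new List<int> { 1, 2, 3 };  // ints stored directly

        Console.WriteLine(SumArrayList(boxed)); // 6
        Console.WriteLine(SumList(plain));      // 6
    }
}
```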