
Why can't you unbox directly to a type which you can explicitly cast to? [duplicate]

Tags: c#, .net

Ran into this as part of some EF/DB code today and ashamed to say I'd never encountered it before.

In .NET you can explicitly cast between types, e.g.

int x = 5;
long y = (long)x;

And you can box to object and unbox back to that original type

int x = 5; 
object y = x;
int z = (int)y;

But you can't unbox directly to a type that you can explicitly cast to

int x = 5;
object y = x;
long z = (long)y; // compiles, but throws InvalidCastException at run time

This is actually documented behaviour, although I had never run into it until today. https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing#example-1

For the unboxing of value types to succeed at run time, the item being unboxed must be a reference to an object that was previously created by boxing an instance of that value type. Attempting to unbox null causes a NullReferenceException. Attempting to unbox a reference to an incompatible value type causes an InvalidCastException.

I'm curious: is there some technical reason why this isn't possible/supported by the runtime?

Eoin Campbell asked Nov 07 '22 14:11


1 Answer

The cast syntax (Foo)bar in C# does one of these things:

  • cast a reference type to another reference type
  • convert a value type to another value type
  • box a value type
  • unbox a boxed value type
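As a rough illustration of the four meanings listed above, here is a minimal sketch. The class and method names (CastKinds, ReferenceCast, and so on) are made up for this example and are not part of any library.

```csharp
using System;

// Hypothetical helper class: each method uses the same (Foo)bar syntax
// for a different underlying operation.
public static class CastKinds
{
    // Reference cast: the object is untouched; only the static type of the
    // reference changes, after a run-time type check.
    public static string ReferenceCast(object o) => (string)o;

    // Value conversion: a new 64-bit value is computed from the 32-bit one.
    public static long ValueConversion(int i) => (long)i;

    // Boxing: the int is copied into a newly allocated object on the heap.
    public static object Box(int i) => (object)i;

    // Unboxing: the int is copied back out of the box; the box must hold
    // exactly an int, otherwise InvalidCastException is thrown.
    public static int Unbox(object o) => (int)o;
}
```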

These operations are semantically very different. It really makes more sense to think of them as four distinct operations which by historical accident happen to share the same (Foo)bar syntax. In particular, they have different constraints on what information needs to be known at compile time:

  • an unboxing operation needs to know the type of the unboxed value
  • a value type conversion needs to know both the source and target types

This is basically because the compiler needs to know at compile time how many bytes each value occupies. In your example, the information that the boxed value is an int is not available at compile time, so the compiler cannot emit the int-to-long conversion. All it can emit is an unboxing to long, and that fails at run time because the box contains an int.
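In practice this means the two steps have to be written out separately, so that the compiler sees both the source and the target type of the conversion. A minimal sketch, with a made-up class name and method names:

```csharp
using System;

// Hypothetical helper class for illustration only.
public static class UnboxThenConvert
{
    // Step 1: unbox to the exact boxed type (int).
    // Step 2: convert int to long, with both types known at compile time.
    public static long FromBoxedInt(object boxed) => (long)(int)boxed;

    // A single cast is compiled as "unbox as long"; the runtime rejects it
    // unless the box really contains a long.
    public static bool DirectUnboxToLongFails(object boxed)
    {
        try
        {
            _ = (long)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

With a boxed int, FromBoxedInt succeeds and DirectUnboxToLongFails returns true; with a boxed long, the direct unboxing works.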

What is counter-intuitive here is that the same constraints do not apply to reference types. Indeed, the whole point of casting reference types is that the compiler doesn't know the exact type at compile time. You use a cast when you know better than the compiler, and the compiler accepts that, and then at runtime performs a type check to ensure that the cast is valid.

This is possible due to some fundamental differences in reference types:

  • A reference type instance knows its own exact type at runtime. It is stored as part of the instance data.
  • Reference types are polymorphic, which means the compiler does not need to know the exact instance type. All references have the same size, so there is no ambiguity about how many bytes to allocate.
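The run-time check on reference casts can be seen in a small sketch like the following. The names ReferenceCastDemo and TryCastToString are invented for this example; in everyday code you would more likely write `o is string s` or `o as string`.

```csharp
using System;

// Hypothetical helper class for illustration only.
public static class ReferenceCastDemo
{
    // The compiler only knows that 'o' is an object. It accepts the cast and
    // leaves it to the runtime to check the instance's actual type.
    public static bool TryCastToString(object o, out string result)
    {
        try
        {
            result = (string)o;   // same-size reference either way
            return true;
        }
        catch (InvalidCastException)
        {
            result = null;
            return false;
        }
    }
}
```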

These semantic differences between the different kinds of casts mean they cannot be merged without compromising safety.

Let's say C# supported unbox-and-convert in a single cast expression:

int x = 70000;
object y = x;
short z = (short)y;

Currently an unboxing cast indicates that you expect that the boxed value is of the given type. If this is not the case, an exception is thrown, so you discover the bug. But a value-type conversion using cast syntax indicates that you know the types are different and that the conversion may lead to data loss.

If the language automatically unboxed and converted, there would be no way to express that you wanted a safe unboxing without any risk of data loss.
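Because the two steps are separate today, each intent can be spelled out explicitly. The sketch below uses invented names (NarrowingChoices and its methods) to show three different things a programmer might mean:

```csharp
using System;

// Hypothetical helper class for illustration only.
public static class NarrowingChoices
{
    // "I know it is an int and I accept data loss":
    // unbox as int, then narrow, discarding the high bits.
    public static short Truncating(object boxedInt) => unchecked((short)(int)boxedInt);

    // "I know it is an int but it must fit":
    // unbox as int, then narrow with overflow checking (OverflowException).
    public static short Checked(object boxedInt) => checked((short)(int)boxedInt);

    // "It must already be a short":
    // plain unboxing, InvalidCastException for any other boxed type.
    public static short Strict(object boxedShort) => (short)boxedShort;
}
```

For the boxed int 70000, Truncating returns 4464 (the low 16 bits), Checked throws OverflowException, and Strict throws InvalidCastException.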

JacquesB answered Nov 15 '22 01:11