
Why can't generic types have explicit layout?

If one tries to make a generic struct with the [StructLayout(LayoutKind.Explicit)] attribute, using the struct generates an exception at runtime:

System.TypeLoadException: Could not load type 'foo' from assembly 'bar' because generic types cannot have explicit layout.
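A minimal sketch of how the failure can be reproduced. The names `Overlay<T>`, `Touch` and `Probe` are made up for illustration. The type is touched from a separate, non-inlined method on the assumption that the exception surfaces when that method is JIT-compiled, so the caller gets a chance to catch it. The output depends on whether the runtime enforces the restriction, so none is shown.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical generic struct with explicit layout. The C# compiler accepts it;
// the restriction is applied by the runtime when the type is loaded.
[StructLayout(LayoutKind.Explicit)]
struct Overlay<T>
{
    [FieldOffset(0)] public T Value;
    [FieldOffset(0)] public long Bits;
}

static class Program
{
    // Kept out of line so the type load happens when this method is compiled,
    // not when the caller is compiled.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static long Touch()
    {
        var o = new Overlay<int>();
        o.Value = 1;
        return o.Bits;
    }

    public static string Probe()
    {
        try { return "loaded, Bits = " + Touch(); }
        catch (TypeLoadException e) { return "TypeLoadException: " + e.Message; }
    }

    static void Main() => Console.WriteLine(Probe());
}
```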

I've been having a hard time finding any evidence that this restriction even exists. The Type.IsExplicitLayout docs strongly imply that it is allowed and supported. Does anyone know why this isn't allowed? I can't think of any reason why generic types would make it less verifiable. It strikes me as an edge case that they simply didn't bother to implement.

Here's an example of why explicit generic layout would be useful:

using System;
using System.Runtime.InteropServices;

public struct TaggedUnion<T1,T2>
{
    public TaggedUnion(T1 value) { _union=new _Union{Type1=value}; _id=1; }
    public TaggedUnion(T2 value) { _union=new _Union{Type2=value}; _id=2; }

    public T1 Type1 { get{ if(_id!=1)_TypeError(1); return _union.Type1; } set{ _union.Type1=value; _id=1; } }
    public T2 Type2 { get{ if(_id!=2)_TypeError(2); return _union.Type2; } set{ _union.Type2=value; _id=2; } }

    public static explicit operator T1(TaggedUnion<T1,T2> value) { return value.Type1; }
    public static explicit operator T2(TaggedUnion<T1,T2> value) { return value.Type2; }
    public static implicit operator TaggedUnion<T1,T2>(T1 value) { return new TaggedUnion<T1,T2>(value); }
    public static implicit operator TaggedUnion<T1,T2>(T2 value) { return new TaggedUnion<T1,T2>(value); }

    public byte Tag {get{ return _id; }}
    public Type GetUnionType() {switch(_id){ case 1:return typeof(T1);  case 2:return typeof(T2);  default:return typeof(void); }}

    _Union _union;
    byte _id;
    void _TypeError(byte id) { throw new InvalidCastException(/* todo */); }

    [StructLayout(LayoutKind.Explicit)]
    struct _Union
    {
        [FieldOffset(0)] public T1 Type1;
        [FieldOffset(0)] public T2 Type2;
    }
}

Usage:

TaggedUnion<int, double> foo = 1;
Debug.Assert(foo.GetUnionType() == typeof(int));
foo = 1.0;
Debug.Assert(foo.GetUnionType() == typeof(double));
double bar = (double) foo;

Edit:

To be clear, note that layouts aren't verified at compile time even if the struct isn't generic. Reference overlap and x64 differences are detected at runtime by the CLR: http://pastebin.com/4RZ6dZ3S I'm asking why generics are restricted when the checks are done at runtime either way.
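For comparison, here is a sketch of the non-generic case described above. `GoodOverlap`, `BadOverlap` and the helper methods are made-up names. Both structs compile; the overlap of an object reference with a non-object field is only rejected when the CLR loads the type.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Two value-type fields sharing offset 0: accepted by the CLR.
[StructLayout(LayoutKind.Explicit)]
struct GoodOverlap
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

// An object reference overlapping a non-object field: compiles,
// but the CLR refuses to load the type.
[StructLayout(LayoutKind.Explicit)]
struct BadOverlap
{
    [FieldOffset(0)] public object Ref;
    [FieldOffset(0)] public long Bits;
}

static class Program
{
    public static int FloatBits(float f)
    {
        var g = new GoodOverlap { AsFloat = f };
        return g.AsInt;
    }

    // Non-inlined so the load failure is raised when this method is compiled.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static long TouchBad()
    {
        var b = new BadOverlap();
        b.Bits = 1;
        return b.Bits;
    }

    public static bool BadOverlapLoads()
    {
        try { TouchBad(); return true; }
        catch (TypeLoadException) { return false; }
    }

    static void Main()
    {
        Console.WriteLine(FloatBits(1.0f).ToString("X8")); // 3F800000
        Console.WriteLine(BadOverlapLoads());              // False
    }
}
```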

DBN asked Nov 04 '14 22:11


2 Answers

The root of the issue is genericity and verifiability, and a design based on type constraints. The rule that we can't overlap references (pointers) with value types is an implicit, multi-parameter constraint. We know the CLR is smart enough to verify this in non-generic cases, so why not in generic ones? It sounds attractive.

A correct generic type definition is one that is verifiable to work today for any type that exists (within the constraints) and any that will be defined in the future (CLR via C#, Richter). The compiler verifies the open generic type definition on its own, considering any type constraints you specify to narrow the possible type arguments.

In the absence of a more specific type constraint, for Foo<T,U>, T and U each represent both the union of all possible value and reference types, and the interface common to all those types (the base System.Object). If we want to make T or U more specific, we can add primary and secondary type constraints. In the latest version of C#, the most specific type we can constrain by is a class or an interface. The struct constraint only requires a non-nullable value type; constraining to a particular struct or primitive type is not supported.

We can't currently say:

  1. where T is one specific struct or primitive type
  2. where T is a sealed type

Ex:

public struct TaggedUnion<T1, T2>
    where T1 : SealedThing   // illegal

So we have no way of defining a generic type that is verifiable to never violate the overlapping rule for all types within T and U. Even with the struct constraint, someone can still define a struct with reference fields, so that for some future type, TaggedUnion<,> wouldn't be correct.
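A small sketch of that last point: a struct with a reference field satisfies `where T : struct` just as well as one without. `PlainValue`, `HasReference` and `ConstraintDemo` are made-up names, and the sketch assumes a runtime that provides `RuntimeHelpers.IsReferenceOrContainsReferences<T>()` (.NET Core 2.0 or later).

```csharp
using System;
using System.Runtime.CompilerServices;

struct PlainValue { public int X; public double Y; }

// Still a value type, but it carries a reference field.
struct HasReference { public int Id; public string Name; }

static class ConstraintDemo
{
    // The struct constraint accepts both types above; it says nothing
    // about what the fields of T are.
    public static bool ContainsReferences<T>() where T : struct
        => RuntimeHelpers.IsReferenceOrContainsReferences<T>();
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(ConstraintDemo.ContainsReferences<PlainValue>());   // False
        Console.WriteLine(ConstraintDemo.ContainsReferences<HasReference>()); // True
    }
}
```

So a union over T1 and T2 constrained only by struct could still end up overlapping a reference with non-reference data for some type arguments.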

So what we are really asking here is: why don't generic types allow implicit type constraints based on code within the class? Explicit layout is an internal implementation detail that imposes restrictions on which combinations of T1 and T2 are legal. In my opinion, that isn't consistent with a design that depends on type constraints. It violates the clean contract of the generic type system as designed. Why go through the trouble of imposing a type constraint system in the first place if we intend to break it? We might as well toss it out and replace it with exceptions.

With the current state of things:

  1. Type constraints are visible metadata of the open generic type
  2. Verification of the generic type Foo<T,U> is performed once, on the open definition Foo<,>. For each bound type instance Foo<t1,u1>, t1 and u1 are checked for type correctness against the constraints. There is no need to reverify the code of the class and its methods for Foo<t1,u1>.
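Both points can be observed through reflection. The sketch below is illustrative only: the constraints chosen for `Foo<T,U>` and the `ConstraintInspector` helper are assumptions, not part of the question. It reads the constraints off the open definition and then lets `MakeGenericType` check candidate type arguments against them.

```csharp
using System;

// Hypothetical generic type with one constraint per parameter.
class Foo<T, U> where T : IDisposable where U : struct { }

static class ConstraintInspector
{
    // Constraints are metadata on the open definition Foo<,>.
    public static Type[] ConstraintsOf(int parameterIndex)
        => typeof(Foo<,>).GetGenericArguments()[parameterIndex]
                         .GetGenericParameterConstraints();

    // MakeGenericType throws ArgumentException when an argument
    // does not satisfy the constraints of its parameter.
    public static bool Satisfies(Type t, Type u)
    {
        try { typeof(Foo<,>).MakeGenericType(t, u); return true; }
        catch (ArgumentException) { return false; }
    }
}

static class Program
{
    static void Main()
    {
        foreach (Type c in ConstraintInspector.ConstraintsOf(0))
            Console.WriteLine("T constraint: " + c);

        // System.IO.MemoryStream implements IDisposable; string does not.
        Console.WriteLine(ConstraintInspector.Satisfies(typeof(System.IO.MemoryStream), typeof(int)));
        Console.WriteLine(ConstraintInspector.Satisfies(typeof(string), typeof(int)));
    }
}
```

Only the type arguments are examined here; nothing about the body of `Foo<T,U>` is rechecked per instantiation, which is the property an explicit-layout rule would have to give up.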

All of this is "As Far As I Know"

There is no hard technical reason why every generic type instantiation could not be semantically analyzed for correctness (C++ is evidence of that), but it would seem to break the design in place.

TL;DR

Without breaking or supplementing the existing type constraint design, there is no way for this to be verifiable.

Perhaps, combined with appropriate new type constraints, we might see it in the future.

codenheim answered Sep 20 '22 17:09


It's specified in ECMA 335 (CLI), partition II, section II.10.1.2:

explicit: The layout of the fields is explicitly provided (§II.10.7). However, a generic type shall not have explicit layout.

You can imagine how it could be awkward - given that the size of a type parameter depends on the type parameter, you could get some decidedly odd effects... a reference field isn't allowed to overlap with a built-in value type or another reference, for example, which would be hard to guarantee as soon as unknown sizes are involved. (I haven't looked into how it works out for 32-bit vs 64-bit references, which have a similar but slightly different issue...)
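As a rough illustration of the size point, the sketch below prints the size of a few possible type arguments. It assumes `System.Runtime.CompilerServices.Unsafe` is available (built in on .NET Core 3.0 and later, a separate package on .NET Framework).

```csharp
using System;
using System.Runtime.CompilerServices;

static class Program
{
    static void Main()
    {
        // The space a field of type T needs depends entirely on T.
        Console.WriteLine(Unsafe.SizeOf<int>());    // 4
        Console.WriteLine(Unsafe.SizeOf<double>()); // 8
        Console.WriteLine(Unsafe.SizeOf<Guid>());   // 16

        // For a reference type the field holds a reference, so its size
        // follows the pointer size of the process (4 or 8 bytes).
        Console.WriteLine(Unsafe.SizeOf<string>() == IntPtr.Size); // True
    }
}
```

A fixed [FieldOffset] chosen in the open definition would therefore land on different things for different type arguments.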

I suspect the specification could have been written to make some more detailed restrictions - but making it a simple blanket restriction on all generic types is considerably simpler.

Jon Skeet answered Sep 20 '22 17:09