Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why can't the meaning of a base class specification recursively depend on itself in C#?

The following piece of C# code does not compile:

public class A
{
    public interface B { }
}              
public class C
    : A,
      C.B // Error given here: The type name 'B' does not exist in the type 'C'.
{ }

public class D : C.B // Compiles without problems if we comment out 'C.B' above.
{ }

This behaviour is correct according to the C# 4.0 specification (paragraph 10.1.4.1):

While determining the meaning of the direct base class specification A of a class B, the direct base class of B is temporarily assumed to be object. Intuitively this ensures that the meaning of a base class specification cannot recursively depend on itself.

My question is: why isn't this behaviour allowed?

Intellisense doesn't have a problem with it - although I know that doesn't say much, after witnessing Visual Studio crash when Intellisense tries to make sense of some evil class combination with variant generics.

Searching the internet for the above quote from the specification yields nothing, so I'm guessing this hasn't been brought up yet anywhere.

Why do I care? I designed the following piece of code:

// The next three classes should really be interfaces,
// but I'm going to override a method later on to prove my point.
// This is a container class, that does nothing except contain two classes.
public class IBagContainer<Bag, Pointer>
    where Bag : IBagContainer<Bag, Pointer>.IBag
    where Pointer : IBagContainer<Bag, Pointer>.IPointer
{
    // This could be an interface for any type of collection.
    public class IBag
    {
        // Insert some object, and return a pointer object to it.
        // The pointer object could be used to speed up certain operations,
        // so you don't have to search for the object again.
        public virtual Pointer Insert(object o) { return null; }
    }
    // This is a pointer type that points somewhere insice an IBag.
    public class IPointer
    {
        // Returns the Bag it belongs to.
        public Bag GetSet() { return null; }
    }
}
// This is another container class, that implements a specific type of IBag.
public class BinarySearchTreeContainer<Tree, Node> : IBagContainer<Tree, Node>
    where Tree : BinarySearchTreeContainer<Tree, Node>.BinarySearchTree
    where Node : BinarySearchTreeContainer<Tree, Node>.BinarySearchTreeNode
{
    // This is your basic binary search tree.
    public class BinarySearchTree : IBagContainer<Tree, Node>.IBag
    {
        // We can search for objects we've put in the tree.
        public Node Search(object o) { return null; }

        // See what I did here? Insert doesn't return a Pointer or IPointer,
        // it returns a Node! Covariant return types!
        public override Node Insert(object o) { return null; }
    }
    // A node in the binary tree. This is a basic example of an IPointer.
    public class BinarySearchTreeNode : IBagContainer<Tree, Node>.IPointer
    {
        // Moar covariant return types!
        public override Tree GetSet() { return null; }
        // If we maintain next and prev pointers in every node,
        // these operations are O(1). You can't expect every IBag
        // to support these operations.
        public Node GetNext() { return null; }
        public Node GetPrev() { return null; }
    }
}

Lo behold, we have achieved covariant return types! There is one small detail however.

Try instantiating a BinarySearchTree. To do that, we need to specify BinarySearchTreeContainer.BinarySearchTree for some suitable Tree and Node classes. For Tree, we'd like to use BinarySearchTree, for which we'd need to specify BinarySearchTreeContainer.BinarySearchTree... And we're stuck.

This is essentially the curiously recurring template pattern (CRTP). Unfortunately, we can't fix it as in CRTP:

public class BinarySearchTreeContainer
    : BinarySearchTreeContainer
        <BinarySearchTreeContainer.BinarySearchTree,
         BinarySearchTreeContainer.BinarySearchTreeNode> { }
public class IBagContainer
    : IBagContainer
        <IBagContainer.IBag,
         IBagContainer.IPointer> { }

(...)
BinarySearchTreeContainer.BinarySearchTree tree
    = new BinarySearchTreeContainer.BinarySearchTree();
tree.Search(null);
IBagContainer.IBag bag = tree; // No cast!
//bag.Search(null); // Invalid!
//BinarySearchTreeContainer.BinarySearchTreeNode node
//    = bag.Insert(null); // Invalid!

And we're back to my original question: the top two class definitions are not allowed by the C# specification. If this class definition was allowed, my binary search trees would be usable. Right now, they merely compile: they can't be used.

like image 214
Alex ten Brink Avatar asked Jan 07 '12 00:01

Alex ten Brink


1 Answers

I have struggled with the issues you bring up for countless hours over the last few years. A detailed discussion of all the issues you raise would take me several hours to type up, so I'll just summarrize:

First, it turns out that even with that "temporarily assumed to be object" clause that Mads and I added to try and tighten up that section of the spec, this section of the spec is still not well-founded. The "how to bind a name to a type" bit of the specification assumes that all nesting and inheritance relationships are known and consistent at the point at which the lookup happens, but of course obviously that cannot be the case since the whole reason we are doing a name lookup in the first place is to determine a base type. If I had my notes with me I could give you a number of examples of crazy type hierarchies where combinations of generics, nestings, interfaces and base classes put the compiler into situations where how you determine what a given name means depends on the order in which you are figuring out the base classes.

Obviously that is not a good place to be. We do not want the meaning of a C# program to differ when you re-order the classes in a file!

Second, we are constrained by what can be represented in metadata.

Third, historically we have been constrained by what can be efficiently emitted into metadata. Previous versions of the metadata emitters had performance or correctness issues if you tried to emit derived types before base types or inner types before outer types. (I attempted in C# 4 to address this by writing a topological sorter that would find an efficient ordering if any existed, but the change proved to be sufficiently complicated and dangerous that we decided to not take the change until Roslyn. In Roslyn we are using a completely different emitter.)

Fourth, it is rare that these sorts of topologies turn up in real production code; you are apparently an exception to that rule.

Fifth, one of our major goals for the language is to make it a "pit of quality" language, where the features of the language lead one to write programs that are both correct and comprehensible. Allowing the sorts of crazy "curiously recurring" patterns that you see in C++ templates is explicitly not a goal of the C# language team. We're not interested in providing a theoretically complete type system; we're interested in making it easy to represent that an Employee is a kind of Person.

All of these factors are working against making circularities in base class and nested class relationships more legal. As much as I personally would enjoy the challenge of coming up with a well-founded system for resolving circularities in base types in a manner that does not break any existing code, it is not a high enough priority; we have a long list of stuff we want to improve for Roslyn, and the base class resolution algorithm is far from the top of that list.

like image 174
Eric Lippert Avatar answered Oct 04 '22 20:10

Eric Lippert