
Why is a less specific overload chosen over a more specific overridden method with the same name? [duplicate]

I have the following fairly simple code:

public class Base
{
    public virtual void Foo(int x)
    {
        Console.WriteLine("Base.Foo(int)");
    }
}
public class Derived : Base
{
    public override void Foo(int x)
    {
        Console.WriteLine("Derived.Foo(int)");
    }
    public void Foo(object o)
    {
        Console.WriteLine("Derived.Foo(object)");
    }
}

Then in the program I write:

Derived d = new Derived();
int i = 1;
d.Foo(i); //prints Derived.Foo(object)

I do not understand, and could not find on the web, why Derived.Foo(object) is called. How can I make sure that Derived.Foo(int) is called?

Radoslaw Jurewicz asked Jan 08 '18 20:01


1 Answer

I do not understand, and could not find on the web, why Derived.Foo(object) is called.

The rule is that more specific is better than more general. But in this case we have a conflict: int is more specific than object, so Base.Foo(int) should win, but declared-in-Derived is more specific than declared-in-Base, so Derived.Foo(object) should win.

Another way to think about it is: "this" is logically an invisible argument, so we can think of the signature of Foo(int) as taking an invisible Base and Foo(object) as taking an invisible Derived. Overload resolution should prefer int to object, but should also prefer Derived to Base, and we have a contradiction.

C# must resolve this conflict.

The rule in C# is that an applicable method declared in a derived class is always better than any method declared in a base class. Moreover, an overridden virtual method is considered to be declared in the class that originally declares it, not in the class that overrides it.
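To make the rule concrete, here is a minimal sketch based on the classes from the question. As an assumption made purely for illustration, the methods return strings instead of printing, so the chosen overload is easy to inspect. The override of Foo(int) counts as declared in Base, so on a receiver of type Derived only Foo(object) counts as declared in Derived, and it wins.

```csharp
using System;

public class Base
{
    public virtual string Foo(int x) => "Base.Foo(int)";
}

public class Derived : Base
{
    // For overload resolution this override counts as declared in Base.
    public override string Foo(int x) => "Derived.Foo(int)";

    // This is the only Foo that counts as declared in Derived.
    public string Foo(object o) => "Derived.Foo(object)";
}

public static class Program
{
    public static void Main()
    {
        Derived d = new Derived();

        // Receiver type is Derived: the applicable method declared in Derived wins.
        Console.WriteLine(d.Foo(1));         // Derived.Foo(object)

        // Receiver type is Base: only Foo(int) is visible, and virtual
        // dispatch then reaches the override in Derived.
        Console.WriteLine(((Base)d).Foo(1)); // Derived.Foo(int)
    }
}
```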

Though many people find this counterintuitive at first, this was a carefully considered decision that is designed to (1) ensure that derived class authors -- who have more information about correct behaviour of the derived class than base class authors do! -- can control the behaviour of the derived class, and (2) mitigate brittle base class failures.

There are several brittle base class failures that it mitigates; in particular, one of them is "a program changes its behaviour depending on subtle details of where an override happens in a class hierarchy". You do not ever want to be in a situation where your derived class code breaks because someone else three base classes deep moved an overload from one place to another in their hierarchy. Those details should be invisible implementation details of the base class, so C# mitigates this failure with its rule that overload resolution considers only where a virtual method is originally declared, and never where it is overridden.

I have many articles on this and other subtle decisions in overload resolution. Here are a couple of useful links; consider browsing my WordPress and MSDN blogs for more articles on related subjects.

https://blogs.msdn.microsoft.com/ericlippert/2007/09/04/future-breaking-changes-part-three/

https://ericlippert.com/2013/12/23/closer-is-better/

How can I make sure that Derived.Foo(int) is called?

There are several possibilities, in order from bad idea to good idea:

First idea: make a non-virtual method. (Bad!)

public class Derived : Base
{
  public new void Foo(int x)
  { 
    Console.WriteLine("Derived.Foo(int)");
  }
  public void Foo(object o)
  {
    Console.WriteLine("Derived.Foo(object)");
  }
}

Now Derived.Foo(int) is called because there is no longer a conflict between base and derived. There are two applicable methods in the derived class, and the one that takes an int is plainly better.

Note that Foo(int) is no longer a virtual override here, so a call to Foo via a reference of type Base will call the base class version. This seems bad! This is working against the design given to you by the author of the base class, who might be relying on you doing a correct virtual override.
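The cost of hiding can be seen in a short sketch. It assumes, for illustration only, string-returning variants of the classes above; the hidden method is reached through a Derived reference but not through a Base reference.

```csharp
using System;

public class Base
{
    public virtual string Foo(int x) => "Base.Foo(int)";
}

public class Derived : Base
{
    // "new" hides Base.Foo(int) instead of overriding it.
    public new string Foo(int x) => "Derived.Foo(int)";

    public string Foo(object o) => "Derived.Foo(object)";
}

public static class Program
{
    public static void Main()
    {
        Derived d = new Derived();

        // Both candidates are declared in Derived, so int beats object.
        Console.WriteLine(d.Foo(1));         // Derived.Foo(int)

        // Through a Base reference there is no override to dispatch to.
        Console.WriteLine(((Base)d).Foo(1)); // Base.Foo(int)
    }
}
```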

Second idea: make the caller work around the problem you created. (Also bad!)

Derived d = new Derived();
int i = 1;
((Base)d).Foo(i);

Since Foo is virtual the derived method will ultimately be called. But this requires that the caller know that you implemented a little "gotcha". This is a trap; don't make your users fall into a trap.

Old-timer Microsoft programmers like me call these API traps "candy machine interfaces". See https://blogs.msdn.microsoft.com/ericlippert/2008/09/08/high-maintenance/

The interface naturally leads you to call it the wrong way. Don't impose that on your users.

Third idea: You implemented this mess; you fix it. (Good)

You asked how to ensure that Foo(int) is called. But you can't! So you have to make Foo(object) do the right thing.

public void Foo(object o)
{
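    if (o is int)
        // Calling through Base makes overload resolution pick Foo(int);
        // virtual dispatch then runs the override in Derived.
        ((Base)this).Foo((int)o);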
    else
        Console.WriteLine("Derived.Foo(object)");
}

When you wrote Foo(object) you said Derived.Foo(object) knows how to handle any object 100% perfectly correctly without any help from overload resolution. So implement those semantics; that's what you signed up for when you wrote this signature.

Fourth idea: If it hurts when you do that, don't do that. (Best)

Solve the problem by not creating it in the first place. Simply never make a more general method in a derived class; it's confusing and almost always wrong. Move the more general method into the base class. Or find another design entirely.
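As a minimal sketch of the "move the more general method into the base class" option: the string-returning methods and the "text" argument are assumptions made for illustration. With both overloads declared in Base, neither gets the declared-in-derived advantage, so the ordinary int-versus-object comparison decides and virtual dispatch still reaches the override.

```csharp
using System;

public class Base
{
    public virtual string Foo(int x) => "Base.Foo(int)";

    // The general overload now lives next to Foo(int) in the base class.
    public string Foo(object o) => "Base.Foo(object)";
}

public class Derived : Base
{
    public override string Foo(int x) => "Derived.Foo(int)";
}

public static class Program
{
    public static void Main()
    {
        Derived d = new Derived();

        // Both overloads count as declared in Base; int is the better match.
        Console.WriteLine(d.Foo(1));      // Derived.Foo(int)

        // Only Foo(object) is applicable to a string argument.
        Console.WriteLine(d.Foo("text")); // Base.Foo(object)
    }
}
```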

FOLLOW UP:

A commenter asks what if we had

public static void Bar(Action<object> a) { } 
public static void Bar(Action<int> a) { } 

And a call:

    Bar(d.Foo);

Now what happens?

This is an ambiguity error. Let's see why. It is subtle!


UPDATE: The analysis below is based on my understanding of C# overload resolution as I left it in November of 2012 when I left Microsoft. I see from the Roslyn sources that the precise scenario described here has motivated a subtle change in overload resolution rules, documented here: https://github.com/dotnet/roslyn/issues/6560

The analysis which follows should therefore be considered to be applicable to C# 5 and maybe 6, but not C# 7. I don't know what the precise rule is for C# 7. Consult the Roslyn sources and github issues for the details!

But wait, it gets worse. In the relevant diff, Aleksey notes in the comments that the change was motivated by backwards compat issues, which means that it is very possible that the change was motivated to keep C# compatible with an existing spec violation bug.

Thus the analysis below might not even be valid for C# 5, since it presumes that the compiler implements the specification.

Plainly this is all a bit of a mess. Proceed with caution when attempting to reason about overload resolution edge cases in C#.


First off, is d.Foo convertible to Action<object>? Yes. The rule is: if we did overload resolution on object x = default(object); d.Foo(x), would overload resolution succeed? Yes, it would, and it would choose Derived.Foo(object). Therefore Bar(Action<object>) is applicable.

Second, is d.Foo convertible to Action<int>? Yes. Again, if we did overload resolution on int x = default(int); d.Foo(x); would overload resolution succeed? Yes it would, and it would produce Derived.Foo(object) again. Therefore Bar(Action<int>) is applicable.

Now hold on a second here. It would be illegal to use Derived.Foo(object) as a delegate of type Action<int> so why does there exist a conversion?

This is one of the most subtle and controversial points of C# design; Mads and I agonized over this during the design of C# 3. There are conversions that are declared to exist but are illegal to use, and this is one of them. As the spec oh-so-clearly says:

Note that the existence of an implicit conversion from E to D does not guarantee that the compile-time application of the conversion will succeed without error.

Wow, C#. Just... wow.

So. We now have two applicable overloads and must choose the unique best of the two. Is there a unique best? No. Action<int> is not convertible to or from Action<object>, so neither is more specific. Thus this is an error.
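If a caller does end up in this situation, one way to sidestep the ambiguity is to fix the delegate type before calling Bar. The sketch below is not part of the original analysis. It assumes, for illustration, that Bar returns a string naming the chosen overload, and the local variable general is hypothetical. Because the variable already has type Action<object>, only Bar(Action<object>) is applicable.

```csharp
using System;

public class Base
{
    public virtual void Foo(int x) { Console.WriteLine("Base.Foo(int)"); }
}

public class Derived : Base
{
    public override void Foo(int x) { Console.WriteLine("Derived.Foo(int)"); }
    public void Foo(object o) { Console.WriteLine("Derived.Foo(object)"); }
}

public static class Program
{
    // Returning a label is an assumption for illustration; the original Bar methods are void.
    public static string Bar(Action<object> a) { return "Bar(Action<object>)"; }
    public static string Bar(Action<int> a) { return "Bar(Action<int>)"; }

    public static void Main()
    {
        Derived d = new Derived();

        // The method group is converted once, to an explicitly chosen delegate type.
        Action<object> general = d.Foo;

        // Action<object> is not convertible to Action<int>, so there is one candidate.
        Console.WriteLine(Bar(general)); // Bar(Action<object>)

        general(1);                      // prints Derived.Foo(object)
    }
}
```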

You could make the argument -- and believe me, many have -- that we should say that since we would get an error if we had no Bar(Action<object>), then Bar(Action<int>) should be inapplicable, and therefore Bar(Action<object>) wins. Though I am sympathetic to that argument, remember what we are reasoning about here: we are reasoning about a crazy situation that ought not to arise in the first place. Overload resolution needs to give sensible answers in common code; that it sometimes gives crazy answers in crazy code is unfortunate, but not a big priority for the design team. Moreover, a good design principle of C# is "when the compiler can't easily figure out what the results of overload resolution are, backtracking to find any possible solution that works is probably a bad idea". In short "if you can't figure it out, guess" is a design principle of JavaScript and Visual Basic, not C#. In C# the design principle is "if it's ambiguous, alert the developer that they have a design problem".

There are some additional subtleties here if instead of a method group we pass an equivalent lambda, but unless you have a specific question about those I won't go into those details.

Eric Lippert answered Sep 19 '22 01:09