Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A definitive guide to API-breaking changes in .NET

Changing a method signature

Kind: Binary-level Break

Languages affected: C# (VB and F# most likely, but untested)

API before change

public static class Foo
{
    public static void bar(int i);
}

API after change

public static class Foo
{
    public static bool bar(int i);
}

Sample client code working before change

Foo.bar(13);

Adding a parameter with a default value.

Kind of Break: Binary-level break

Even if the calling source code doesn't need to change, it still needs to be recompiled (just like when adding a regular parameter).

That is because C# compiles the default values of the parameters directly into the calling assembly. It means that if you don't recompile, you will get a MissingMethodException because the old assembly tries to call a method with less arguments.

API Before Change

public void Foo(int a) { }

API After Change

public void Foo(int a, string b = null) { }

Sample client code that is broken afterwards

Foo(5);

The client code needs to be recompiled into Foo(5, null) at the bytecode level. The called assembly will only contain Foo(int, string), not Foo(int). That's because default parameter values are purely a language feature, the .Net runtime does not know anything about them. (This also explain why default values have to be compile-time constants in C#).


This one was very non-obvious when I discovered it, especially in light of the difference with the same situation for interfaces. It's not a break at all, but it's surprising enough that I decided to include it:

Refactoring class members into a base class

Kind: not a break!

Languages affected: none (i.e. none are broken)

API before change:

class Foo
{
    public virtual void Bar() {}
    public virtual void Baz() {}
}

API after change:

class FooBase
{
    public virtual void Bar() {}
}

class Foo : FooBase
{
    public virtual void Baz() {}
}

Sample code that keeps working throughout the change (even though I expected it to break):

// C++/CLI
ref class Derived : Foo
{
   public virtual void Baz() {{

   // Explicit override    
   public virtual void BarOverride() = Foo::Bar {}
};

Notes:

C++/CLI is the only .NET language that has a construct analogous to explicit interface implementation for virtual base class members - "explicit override". I fully expected that to result in the same kind of breakage as when moving interface members to a base interface (since IL generated for explicit override is the same as for explicit implementation). To my surprise, this is not the case - even though generated IL still specifies that BarOverride overrides Foo::Bar rather than FooBase::Bar, assembly loader is smart enough to substitute one for another correctly without any complaints - apparently, the fact that Foo is a class is what makes the difference. Go figure...


This one is a perhaps not-so-obvious special case of "adding/removing interface members", and I figured it deserves its own entry in light of another case which I'm going to post next. So:

Refactoring interface members into a base interface

Kind: breaks at both source and binary levels

Languages affected: C#, VB, C++/CLI, F# (for source break; binary one naturally affects any language)

API before change:

interface IFoo
{
    void Bar();
    void Baz();
}

API after change:

interface IFooBase 
{
    void Bar();
}

interface IFoo : IFooBase
{
    void Baz();
}

Sample client code that is broken by change at source level:

class Foo : IFoo
{
   void IFoo.Bar() { ... }
   void IFoo.Baz() { ... }
}

Sample client code that is broken by change at binary level;

(new Foo()).Bar();

Notes:

For source level break, the problem is that C#, VB and C++/CLI all require exact interface name in the declaration of interface member implementation; thus, if the member gets moved to a base interface, the code will no longer compile.

Binary break is due to the fact that interface methods are fully qualified in generated IL for explicit implementations, and interface name there must also be exact.

Implicit implementation where available (i.e. C# and C++/CLI, but not VB) will work fine on both source and binary level. Method calls do not break either.


Reordering enumerated values

Kind of break: Source-level/Binary-level quiet semantics change

Languages affected: all

Reordering enumerated values will keep source-level compatibility as literals have the same name, but their ordinal indices will be updated, which can cause some kinds of silent source-level breaks.

Even worse is the silent binary-level breaks that can be introduced if client code is not recompiled against the new API version. Enum values are compile-time constants and as such any uses of them are baked into the client assembly's IL. This case can be particularly hard to spot at times.

API Before Change

public enum Foo
{
   Bar,
   Baz
}

API After Change

public enum Foo
{
   Baz,
   Bar
}

Sample client code that works but is broken afterwards:

Foo.Bar < Foo.Baz

This one is really a very rare thing in practice, but nonetheless a surprising one when it happens.

Adding new non-overloaded members

Kind: source level break or quiet semantics change.

Languages affected: C#, VB

Languages not affected: F#, C++/CLI

API before change:

public class Foo
{
}

API after change:

public class Foo
{
    public void Frob() {}
}

Sample client code that is broken by change:

class Bar
{
    public void Frob() {}
}

class Program
{
    static void Qux(Action<Foo> a)
    {
    }

    static void Qux(Action<Bar> a)
    {
    }

    static void Main()
    {
        Qux(x => x.Frob());        
    }
}

Notes:

The problem here is caused by lambda type inference in C# and VB in presence of overload resolution. A limited form of duck typing is employed here to break ties where more than one type matches, by checking whether the body of the lambda makes sense for a given type - if only one type results in compilable body, that one is chosen.

The danger here is that client code may have an overloaded method group where some methods take arguments of his own types, and others take arguments of types exposed by your library. If any of his code then relies on type inference algorithm to determine the correct method based solely on presence or absence of members, then adding a new member to one of your types with the same name as in one of the client's types can potentially throw inference off, resulting in ambiguity during overload resolution.

Note that types Foo and Bar in this example are not related in any way, not by inheritance nor otherwise. Mere use of them in a single method group is enough to trigger this, and if this occurs in client code, you have no control over it.

The sample code above demonstrates a simpler situation where this is a source-level break (i.e. compiler error results). However, this can also be a silent semantics change, if the overload that was chosen via inference had other arguments which would otherwise cause it to be ranked below (e.g. optional arguments with default values, or type mismatch between declared and actual argument requiring an implicit conversion). In such scenario, the overload resolution will no longer fail, but a different overload will be quietly selected by the compiler. In practice, however, it is very hard to run into this case without carefully constructing method signatures to deliberately cause it.


Convert an implicit interface implementation into an explicit one.

Kind of Break: Source and Binary

Languages Affected: All

This is really just a variation of changing a method's accessibility - its just a little more subtle since it's easy to overlook the fact that not all access to an interface's methods are necessarily through a reference to the type of the interface.

API Before Change:

public class Foo : IEnumerable
{
    public IEnumerator GetEnumerator();
}

API After Change:

public class Foo : IEnumerable
{
    IEnumerator IEnumerable.GetEnumerator();
}

Sample Client code that works before change and is broken afterwards:

new Foo().GetEnumerator(); // fails because GetEnumerator() is no longer public