I've run into a performance problem that I can't quite understand. I know how to fix it, but I don't understand why it happens. It's just for fun!
Let's talk code. I simplified the code as much as I could to reproduce the issue.
Suppose we have a generic class. It has an empty list inside and does something with T in the constructor. It has a Run method that calls an IEnumerable<T> method on the list, e.g. Any().
public class BaseClass<T>
{
    private List<T> _list = new List<T>();

    public BaseClass()
    {
        Enumerable.Empty<T>();
        // or Enumerable.Repeat(new T(), 10);
        // or even new T();
        // or foreach (var item in _list) {}
    }

    public void Run()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            // or if (_list.Count() > 0)
            // or if (_list.FirstOrDefault() != null)
            // or if (_list.SingleOrDefault() != null)
            // or other IEnumerable<T> method
            {
                return;
            }
        }
    }
}
Then we have a derived class which is empty:
public class DerivedClass : BaseClass<object> { }
Let's measure the performance of running the BaseClass<T>.Run method from both classes. Calling it through the derived type is 4 times slower than calling it through the base class, and I can't understand why. Compiled in Release mode; the result is the same with a warm-up. It happens on .NET 4.5 only.
public class Program
{
    public static void Main()
    {
        Measure(new DerivedClass());
        Measure(new BaseClass<object>());
    }

    private static void Measure(BaseClass<object> baseClass)
    {
        var sw = Stopwatch.StartNew();
        baseClass.Run();
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}
Full listing on gist
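The measurement above times each case once. A slightly more defensive variant, with an explicit warm-up call and the best of several timed runs, might look like the sketch below. The names Benchmark and BestOf are hypothetical, and the class definitions are repeated only to keep the sketch self-contained; the absolute numbers will depend on the runtime and machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class BaseClass<T>
{
    private readonly List<T> _list = new List<T>();

    public BaseClass()
    {
        Enumerable.Empty<T>();
    }

    public void Run()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            {
                return;
            }
        }
    }
}

public class DerivedClass : BaseClass<object> { }

public static class Benchmark
{
    // Hypothetical helper: calls the action once as a warm-up, then returns
    // the fastest of `rounds` timed runs, in milliseconds.
    public static long BestOf(int rounds, Action action)
    {
        action(); // warm-up, so JIT compilation is not part of the timing
        var best = long.MaxValue;
        for (var r = 0; r < rounds; r++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            best = Math.Min(best, sw.ElapsedMilliseconds);
        }
        return best;
    }

    public static void Main()
    {
        var derived = new DerivedClass();
        var plain = new BaseClass<object>();
        Console.WriteLine("Derived: " + BestOf(5, derived.Run) + " ms");
        Console.WriteLine("Base:    " + BestOf(5, plain.Run) + " ms");
    }
}
```

Taking the minimum rather than the average is a design choice here: it filters out runs disturbed by unrelated activity on the machine.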
Update:
There's an answer from the CLR team on Microsoft Connect:
It is related to dictionary lookups in shared generics code. The heuristic in runtime and JIT do not work well for this particular test. We will take a look what can be done about it.
In the meantime, you can workaround it by adding two dummy methods to the BaseClass (do not even need to be called). It will cause the heuristic to work as one would expect.
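For readers unfamiliar with the term "shared generics code": the CLR generally compiles one native body per generic method for all reference-type arguments, so anything that depends on the exact T has to be resolved at run time. The following is only a minimal illustration of that idea, not the CLR team's repro; SharedGenericsDemo and NameOf are hypothetical names.

```csharp
using System;

public static class SharedGenericsDemo
{
    // For reference-type arguments the runtime typically reuses one compiled
    // body, so the exact T used here has to be looked up at run time.
    public static string NameOf<T>()
    {
        return typeof(T).Name;
    }

    public static void Main()
    {
        Console.WriteLine(NameOf<object>()); // prints "Object"
        Console.WriteLine(NameOf<string>()); // prints "String"
    }
}
```

Both calls return the correct type name even though the compiled code may be the same, which is why a lookup is needed somewhere. The quoted answer attributes the slowdown to how that lookup behaves in this particular test.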
Original:
That's a JIT fail.
It can be fixed by this crazy workaround:
public class BaseClass<T>
{
    private List<T> _list = new List<T>();

    public BaseClass()
    {
        Enumerable.Empty<T>();
        // or Enumerable.Repeat(new T(), 10);
        // or even new T();
        // or foreach (var item in _list) {}
    }

    public void Run()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            {
                return;
            }
        }
    }

    public void Run2()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            {
                return;
            }
        }
    }

    public void Run3()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            {
                return;
            }
        }
    }
}
Note that Run2() and Run3() are not called from anywhere. But if you comment out either Run2 or Run3, you'll get the same performance penalty as before.
My guess is that it's related to stack alignment or to the size of the method table.
P.S. You can replace

Enumerable.Empty<T>();
// with
var x = new Func<IEnumerable<T>>(Enumerable.Empty<T>);

and you still get the same bug.