When using the extension method of IEnumerable<T>
Count()
, an array is at least two times slower than a list.
Function Count() List<int> 2,299 int[] 6,903
From where did the difference comes?
I understand that both are calling the Count
property of ICollection
:
If the type of source implements ICollection, that implementation is used to obtain the count of elements. Otherwise, this method determines the count.
For the list it returns List<T>.Count
, and for array, Array.Length
. Moreover, Array.Length
is supposed to be faster than List<T>.Count
.
Benchmark:
class Program { public const long Iterations = (long)1e8; static void Main() { var list = new List<int>(){1}; var array = new int[1]; array[0] = 1; var results = new Dictionary<string, TimeSpan>(); results.Add("List<int>", Benchmark(list, Iterations)); results.Add("int[]", Benchmark(array, Iterations)); Console.WriteLine("Function".PadRight(30) + "Count()"); foreach (var result in results) { Console.WriteLine("{0}{1}", result.Key.PadRight(30), Math.Round(result.Value.TotalSeconds, 3)); } Console.ReadLine(); } public static TimeSpan Benchmark(IEnumerable<int> source, long iterations) { var countWatch = new Stopwatch(); countWatch.Start(); for (long i = 0; i < iterations; i++) source.Count(); countWatch.Stop(); return countWatch.Elapsed; } }
Edit:
leppie and Knaģis answers are pretty amazing, but I want to add a remark.
As Jon Skeet said:
There are effectively two equivalent blocks, just testing for different collection interface types, and using whichever one it finds first (if any). I don't know whether the .NET implementation tests for ICollection or ICollection< T > first - I could test it by implementing both interfaces but returning different counts from each, of course, but that's probably overkill. It doesn't really matter for well-behaved collections other than the slight performance difference - we want to test the "most likely" interface first, which I believe is the generic one.
The generic one could be the most likely to happens, but if you invert the two, ie call the non generic cast before the generic one, Array.Count() becomes a little faster than List.Count(). On the other hand, non generic version is slower for List.
Good to know if anyone want to call Count()
in an 1e8 iterations loop!
Function ICollection<T> Cast ICollection Cast List 1,268 1,738 Array 5,925 1,683
The reason is that Enumerable.Count<T>()
performs a cast to ICollection<T>
to retrieve the count both from the list and the array.
Using this sample code:
public static int Count<TSource>(IEnumerable<TSource> source) { ICollection<TSource> collection = source as ICollection<TSource>; if (collection != null) { return 1; // collection.Count; } }
you can determine that the cast takes much longer for the array, in fact most of the time taken for counting is from this cast:
Function Count() List<int> 1,575 int[] 5,069
The key might be this statement from the documentation (emphasis mine):
In the .NET Framework version 2.0, the Array class implements the System.Collections.Generic.IList, System.Collections.Generic.ICollection, and System.Collections.Generic.IEnumerable generic interfaces. The implementations are provided to arrays at run time, and therefore are not visible to the documentation build tools. As a result, the generic interfaces do not appear in the declaration syntax for the Array class, and there are no reference topics for interface members that are accessible only by casting an array to the generic interface type (explicit interface implementations).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With