I designed the following test:
var arrayLength=5000;
object[] objArray=new object[arrayLength];
for(var x=0;x<arrayLength;x++)
{
objArray[x]=new object();
}
objArray[4000]=null;
const int TestSize=int.MaxValue;
System.Diagnostics.Stopwatch v= new Stopwatch();
v.Start();
for(var x=0;x<10000;x++)
{
objArray.Contains(null);
}
v.Stop();
objArray.Contains(null).Dump();
v.Elapsed.ToString().Dump("Contains");
//Any ==
v.Reset();
v.Start();
for(var x=0;x<10000;x++)
{
objArray.Any(o=>o==null);
}
v.Stop();
objArray.Any(x=>x==null).Dump();
v.Elapsed.ToString().Dump("Any");
//Any Equals
v.Reset();
v.Start();
for(var x=0;x<10000;x++)
{
objArray.Any(obj=>object.Equals( obj,null));
}
v.Stop();
objArray.Any(obj=>object.Equals( obj,null)).Dump();
v.Elapsed.ToString().Dump("Any");
The results when null is not present:
Contains False 00:00:00.0606484
Any == False 00:00:00.7532898
Any object.Equals False 00:00:00.8431783
When null is present at element 4000:
Contains True 00:00:00.0494515
Any == True 00:00:00.5929247
Any object.Equals True 00:00:00.6700742
When null is present at element 10:
Contains True 00:00:00.0038035
Any == True 00:00:00.0025687
Any True 00:00:00.0033769
So when the object is near the front, Any
is slightly faster; when it's at the back, it's much much slower. Why?
First, Any() without parameters is quicker than Count() as it does not iterate through the whole collection. Second, Any() improves the readability of your code, indicating that the developer just wants to check if there are any items in the collection and isn't concerned with the exact number of items.
Any() is ALWAYS faster than . Count() > 0 ).
If your collection is in the form of an IEnumerable, the Count() method will iterate through all elements, whereas Any() won't have to. So for enumerables, Any() will have a (potentially significant) performance benefit. In your example, however, Pets is an array, and so you would be better off using .
Using Contains in Entity Framework is actually very slow. It's true that it translates into an IN clause in SQL and that the SQL query itself is executed fast. But the problem and the performance bottleneck is in the translation from your LINQ query into SQL.
Any
will have to call a delegate for every element it checks (an extra callvirt
instruction which is unlikely to get inlined by the JIT). Contains
only performs that check. That's why Any
is slower. I suspect the fact that Any
looks faster than contains when the element is seen very early is that the benchmark can't reflect it easily since they are very close. The setup time for the method call is the majority of the work done in that case (rather than the actual searching operation).
The anonymous method:
--- C:\Users\Mehrdad\AppData\Local\Temporary Projects\ConsoleApplication1\Program.cs
Console.WriteLine(s.Any(a => a == 1));
00000000 xor eax,eax
00000002 cmp ecx,1
00000005 sete al
00000008 ret
Relevant part of Enumerable.Any code:
...
00000051 mov edx,eax
00000053 mov rcx,qword ptr [rbx+8]
00000057 call qword ptr [rbx+18h] // calls the anonymous method above
0000005a movzx ecx,al
0000005d test ecx,ecx
...
Any is slower because Contains is tailored to the specific container you're using (Array/List/etc), so it doesn't have the overhead of firing up an IEnumerable, calling MoveNext() all the time, etc.
However, using Any will make refactoring easier since if you change that collection, you can stil use it, so really I'd only change it to Contains if you know via a profiler that this is an important piece of code. And if it is, you should probably end up using a smarter data structure anyways like a HashSet, since both Any and Contains are O(n).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With