Which method performs better: .Any() vs .Count() > 0?

With the System.Linq namespace, our IEnumerables now have the Any() and Count() extension methods.

I was told recently that if I want to check that a collection contains one or more items, I should use the .Any() extension method instead of .Count() > 0, because the .Count() extension method has to iterate through all the items.

Secondly, some collections have a property (not an extension method) that is Count or Length. Would it be better to use those, instead of .Any() or .Count()?

Yea / nay?

Pure.Krome asked Nov 20 '08 12:11


2 Answers

If you are starting with something that has a .Length or .Count (such as ICollection<T>, IList<T>, List<T>, etc) - then this will be the fastest option, since it doesn't need to go through the GetEnumerator()/MoveNext()/Dispose() sequence required by Any() to check for a non-empty IEnumerable<T> sequence.
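As a minimal sketch of the point above (the class and method names here are hypothetical, chosen only for illustration), reading the property directly avoids creating an enumerator at all:

```csharp
using System;
using System.Collections.Generic;

public static class EmptinessChecks
{
    // Arrays expose Length as a property; no enumerator is created.
    public static bool HasItems<T>(T[] items) => items.Length > 0;

    // ICollection<T> implementations (List<T>, HashSet<T>, ...) expose Count as a property.
    public static bool HasItems<T>(ICollection<T> items) => items.Count > 0;
}
```

For example, `EmptinessChecks.HasItems(new List<string> { "a" })` reads `Count` once and never calls `GetEnumerator()`.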

If you only have an IEnumerable<T>, Any() will generally be quicker, as it only has to look at one element. However, note that the LINQ-to-Objects implementation of Count() does check for ICollection<T> (using .Count as an optimisation) - so if your underlying data-source is directly a list/collection, there won't be a huge difference. Don't ask me why it doesn't use the non-generic ICollection...
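One way to observe the ICollection<T> shortcut described above is a collection that reports a Count but throws if anyone enumerates it. This is only a sketch: `CountOnlyCollection` and `CountShortcutDemo` are hypothetical types written for this illustration, not part of any library.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: knows its size, but refuses to be enumerated.
public sealed class CountOnlyCollection : ICollection<int>
{
    private readonly int _count;
    public CountOnlyCollection(int count) { _count = count; }

    public int Count => _count;
    public bool IsReadOnly => true;

    // Any attempt to walk the items fails loudly.
    public IEnumerator<int> GetEnumerator() =>
        throw new InvalidOperationException("The sequence was enumerated.");
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The remaining ICollection<int> members are not needed for the demo.
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => throw new NotSupportedException();
    public void CopyTo(int[] array, int arrayIndex) => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();
}

public static class CountShortcutDemo
{
    public static int CountViaLinq(int size)
    {
        // Typed as IEnumerable<int>, so this calls the LINQ Count() extension method.
        IEnumerable<int> sequence = new CountOnlyCollection(size);
        return sequence.Count(); // succeeds only because Count() reads the Count property
    }
}
```

If Count() had walked the sequence, `CountViaLinq` would throw instead of returning the size.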

Of course, if you have used LINQ to filter it (Where etc.), you will have an iterator-block-based sequence, and so this ICollection<T> optimisation does not apply.
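To make the difference on a filtered sequence concrete, here is a rough sketch that counts how many source items each call actually pulls. `LazySequenceDemo` and `Tracked` are hypothetical helpers written for this illustration, and the `Where(x => x >= 0)` filter keeps every item; it is only there to produce a lazy sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySequenceDemo
{
    // Hypothetical helper: yields 'count' integers and calls onYield each time one is produced.
    public static IEnumerable<int> Tracked(int count, Action onYield)
    {
        for (int i = 0; i < count; i++)
        {
            onYield();
            yield return i;
        }
    }

    // Number of source items produced while evaluating Any() on a filtered sequence.
    public static int ItemsPulledByAny(int count)
    {
        int pulled = 0;
        bool nonEmpty = Tracked(count, () => pulled++).Where(x => x >= 0).Any();
        return pulled;
    }

    // Number of source items produced while evaluating Count() > 0 on the same sequence.
    public static int ItemsPulledByCount(int count)
    {
        int pulled = 0;
        bool nonEmpty = Tracked(count, () => pulled++).Where(x => x >= 0).Count() > 0;
        return pulled;
    }
}
```

With 1000 source items, Any() stops after the first match and pulls 1 item, while Count() > 0 pulls all 1000.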

In general with IEnumerable<T> : stick with Any() ;-p

Marc Gravell answered Sep 17 '22 17:09


Note: I wrote this answer when Entity Framework 4 was current. The point of this answer was not to get into trivial .Any() vs .Count() performance testing. The point was to signal that EF is far from perfect. Newer versions are better, but if you have a part of your code that's slow and it uses EF, test with direct T-SQL and compare performance rather than relying on assumptions (such as that .Any() is ALWAYS faster than .Count() > 0).


While I agree with the most up-voted answer and comments, especially on the point that Any signals developer intent better than Count() > 0, I've had a situation in which Count was faster by an order of magnitude on SQL Server (Entity Framework 4).

Here is the query with Any that threw a timeout exception (on ~200,000 records):

    con = db.Contacts.
        Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
            && !a.NewsletterLogs.Any(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr)
        ).OrderBy(a => a.ContactId).
        Skip(position - 1).
        Take(1).FirstOrDefault();

The Count version executed in a matter of milliseconds:

    con = db.Contacts.
        Where(a => a.CompanyId == companyId && a.ContactStatusId <= (int) Const.ContactStatusEnum.Reactivated
            && a.NewsletterLogs.Count(b => b.NewsletterLogTypeId == (int) Const.NewsletterLogTypeEnum.Unsubscr) == 0
        ).OrderBy(a => a.ContactId).
        Skip(position - 1).
        Take(1).FirstOrDefault();

I still need to see the exact SQL that both LINQ queries produce, but it's obvious there is a huge performance difference between Count and Any in some cases, and unfortunately it seems you can't just stick with Any in all cases.

EDIT: Here is the generated SQL. Beauties, as you can see ;)

ANY:

    exec sp_executesql N'SELECT TOP (1)
    [Project2].[ContactId] AS [ContactId],
    [Project2].[CompanyId] AS [CompanyId],
    [Project2].[ContactName] AS [ContactName],
    [Project2].[FullName] AS [FullName],
    [Project2].[ContactStatusId] AS [ContactStatusId],
    [Project2].[Created] AS [Created]
    FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
        FROM ( SELECT
            [Extent1].[ContactId] AS [ContactId],
            [Extent1].[CompanyId] AS [CompanyId],
            [Extent1].[ContactName] AS [ContactName],
            [Extent1].[FullName] AS [FullName],
            [Extent1].[ContactStatusId] AS [ContactStatusId],
            [Extent1].[Created] AS [Created]
            FROM [dbo].[Contact] AS [Extent1]
            WHERE ([Extent1].[CompanyId] = @p__linq__0) AND ([Extent1].[ContactStatusId] <= 3) AND ( NOT EXISTS (SELECT
                1 AS [C1]
                FROM [dbo].[NewsletterLog] AS [Extent2]
                WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])
            ))
        )  AS [Project2]
    )  AS [Project2]
    WHERE [Project2].[row_number] > 99
    ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

COUNT:

    exec sp_executesql N'SELECT TOP (1)
    [Project2].[ContactId] AS [ContactId],
    [Project2].[CompanyId] AS [CompanyId],
    [Project2].[ContactName] AS [ContactName],
    [Project2].[FullName] AS [FullName],
    [Project2].[ContactStatusId] AS [ContactStatusId],
    [Project2].[Created] AS [Created]
    FROM ( SELECT [Project2].[ContactId] AS [ContactId], [Project2].[CompanyId] AS [CompanyId], [Project2].[ContactName] AS [ContactName], [Project2].[FullName] AS [FullName], [Project2].[ContactStatusId] AS [ContactStatusId], [Project2].[Created] AS [Created], row_number() OVER (ORDER BY [Project2].[ContactId] ASC) AS [row_number]
        FROM ( SELECT
            [Project1].[ContactId] AS [ContactId],
            [Project1].[CompanyId] AS [CompanyId],
            [Project1].[ContactName] AS [ContactName],
            [Project1].[FullName] AS [FullName],
            [Project1].[ContactStatusId] AS [ContactStatusId],
            [Project1].[Created] AS [Created]
            FROM ( SELECT
                [Extent1].[ContactId] AS [ContactId],
                [Extent1].[CompanyId] AS [CompanyId],
                [Extent1].[ContactName] AS [ContactName],
                [Extent1].[FullName] AS [FullName],
                [Extent1].[ContactStatusId] AS [ContactStatusId],
                [Extent1].[Created] AS [Created],
                (SELECT
                    COUNT(1) AS [A1]
                    FROM [dbo].[NewsletterLog] AS [Extent2]
                    WHERE ([Extent1].[ContactId] = [Extent2].[ContactId]) AND (6 = [Extent2].[NewsletterLogTypeId])) AS [C1]
                FROM [dbo].[Contact] AS [Extent1]
            )  AS [Project1]
            WHERE ([Project1].[CompanyId] = @p__linq__0) AND ([Project1].[ContactStatusId] <= 3) AND (0 = [Project1].[C1])
        )  AS [Project2]
    )  AS [Project2]
    WHERE [Project2].[row_number] > 99
    ORDER BY [Project2].[ContactId] ASC',N'@p__linq__0 int',@p__linq__0=4

It seems that a plain WHERE with NOT EXISTS performs much worse here than calculating the count in a subquery and then filtering on count = 0.

Let me know if you see an error in my findings. Regardless of the Any vs Count discussion, what I take away from all this is that any more complex LINQ query is much better off rewritten as a stored procedure ;).

nikib3ro answered Sep 16 '22 17:09