
Enumeration Performance

Consider the following code example:

IEnumerable<Y> myEnumeration = *some linq stuff into a datatable*

if (myEnumeration.Count() > X)
{
    foreach (Y bla in myEnumeration)
    {
        // do something
    }
}

Will this enumerate the sequence twice, once for the Count() call and once for the foreach? If so, is there a way to avoid one of the enumerations?

Thanks in advance

Edit: changed myEnumeration.Count to myEnumeration.Count() (the extension method).

DanielG asked Jan 23 '14 07:01


2 Answers

I ran this code in LINQPad to see the generated SQL:

IEnumerable<MyTable> myEnumeration = MyTable;

if (myEnumeration.Count() > 1)
{
    foreach(MyTable bla in myEnumeration)
    {
       // do something
    }
}

The generated SQL is the following:

SELECT * FROM [MyTable] AS [t0]
GO

SELECT *
FROM [MyTable] AS [t0]

So yes, the data is retrieved from the database twice. Consider materializing the query once with ToList():

List<Y> myEnumeration = (*some linq stuff into a datatable*).ToList();
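
As a minimal sketch of the difference, assuming an in-memory sequence that stands in for the database query, the following counts how often the sequence is enumerated from the start. The `EnumerationDemo` class, the `CountingSource` helper and the `Enumerations` counter are hypothetical names made up for this illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationDemo
{
    // Incremented each time the sequence is enumerated from the start.
    public static int Enumerations;

    // Hypothetical stand-in for the deferred query in the question.
    public static IEnumerable<int> CountingSource()
    {
        Enumerations++;
        for (int i = 0; i < 5; i++)
            yield return i;
    }

    // Count() followed by foreach on the deferred sequence.
    public static int CountThenLoop()
    {
        Enumerations = 0;
        IEnumerable<int> lazy = CountingSource();
        if (lazy.Count() > 1)
        {
            foreach (int item in lazy) { /* do something */ }
        }
        return Enumerations;
    }

    // ToList() first, then the list's Count property and foreach.
    public static int MaterializeThenLoop()
    {
        Enumerations = 0;
        List<int> list = CountingSource().ToList();
        if (list.Count > 1)
        {
            foreach (int item in list) { /* do something */ }
        }
        return Enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(CountThenLoop());       // prints 2
        Console.WriteLine(MaterializeThenLoop()); // prints 1
    }
}
```

The list's Count property reads a stored value, so after ToList() neither the check nor the loop touches the source again. The price is that all items are held in memory at once.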
Dannydust answered Sep 22 '22 11:09


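Yes, that will give you two database calls. When the query is an IQueryable<Y>, Count() executes a query like the one below (called through an IEnumerable<Y> reference, it enumerates the whole result set instead, as the other answer shows):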

 SELECT COUNT(1) FROM Table WHERE Blah

And then GetEnumerator() will execute a query that fetches all required fields:

 SELECT Id, Foo, Bar FROM Table WHERE Blah

Actually, there is no single correct answer. You should consider:

  • the number of results you usually get (millions of entities, or just several dozen)
  • the number of required entities (several, or hundreds)
  • which happens more often: the required number of entities is present in the result set, or it is not
  • whether this is a real performance issue, or the method is called once a week

Depending on that, make your decision:

  • if it's not a performance issue, simply make two database calls
  • if the number of returned items is not huge and they are likely to contain the required number of items, just dump the query into a list
  • if the number of items is pretty big and you don't want to dump them all, you can use the extension method below, which checks whether there are at least N items in the result set without saving the whole sequence to a list. But here you should consider which will be faster: buffering the N required items, or making a database call to check the item count.

Here is the extension:

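public static IEnumerable<T> TakeIfMoreThan<T>(
    this IEnumerable<T> source, int count)
{
    // Despite the name, items are yielded when the source
    // contains at least `count` items.
    List<T> buffer = new List<T>(count);

    using (var iterator = source.GetEnumerator())
    {
        // Buffer up to `count` items to find out whether enough exist.
        while (buffer.Count < count && iterator.MoveNext())
            buffer.Add(iterator.Current);

        if (buffer.Count < count)
        {
            // Too few items: yield nothing.
            yield break;
        }
        else
        {
            // Replay the buffered items first...
            foreach (var item in buffer)
                yield return item;

            // ...then stream the rest without buffering.
            buffer.Clear();
            while (iterator.MoveNext())
                yield return iterator.Current;
        }
    }
}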

Usage is simple:

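// The extension yields when at least `count` items exist,
// so pass X + 1 to match the original Count() > X check.
foreach (Y bla in myEnumeration.TakeIfMoreThan(X + 1))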
{
   // do something
}

This way you don't need to dump all query results into an in-memory list. You use a single database call (though it queries all item fields), and you don't enumerate any items if there are fewer than the required number of results.
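
To make the single-pass behaviour concrete, here is a small sketch that runs the extension against an in-memory sequence instead of a database. The `TakeIfMoreThanDemo` class, the `CountingSource` helper and the `Enumerations` counter are hypothetical and exist only for this illustration; the extension itself is repeated so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // Same extension as above, repeated so the sketch is self-contained.
    public static IEnumerable<T> TakeIfMoreThan<T>(
        this IEnumerable<T> source, int count)
    {
        var buffer = new List<T>(count);
        using (var iterator = source.GetEnumerator())
        {
            while (buffer.Count < count && iterator.MoveNext())
                buffer.Add(iterator.Current);

            if (buffer.Count < count)
                yield break;

            foreach (var item in buffer)
                yield return item;

            buffer.Clear();
            while (iterator.MoveNext())
                yield return iterator.Current;
        }
    }
}

public static class TakeIfMoreThanDemo
{
    // Incremented each time the source is enumerated from the start.
    public static int Enumerations;

    // Hypothetical stand-in for a deferred query returning `size` rows.
    public static IEnumerable<int> CountingSource(int size)
    {
        Enumerations++;
        for (int i = 0; i < size; i++)
            yield return i;
    }

    public static void Main()
    {
        Enumerations = 0;
        List<int> enough = CountingSource(5).TakeIfMoreThan(3).ToList();
        Console.WriteLine(enough.Count);  // prints 5
        Console.WriteLine(Enumerations);  // prints 1

        List<int> tooFew = CountingSource(2).TakeIfMoreThan(3).ToList();
        Console.WriteLine(tooFew.Count);  // prints 0
    }
}
```

Only the first `count` items are ever buffered; everything after that is streamed straight from the source enumerator.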

Sergey Berezovskiy answered Sep 22 '22 11:09