Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance between Iterating through IEnumerable<T> and List<T>

Today, I faced a problem with performance while iterating through a list of items. After done some diagnostic, I finally figured out the reason which slowed down performance. It turned out that iterating through an IEnumerable<T> took much more time than iterating through a List<T>. Please help me understand why IEnumerable<T> is slower than List<T>.

UPDATE benchmark context:

I'm using NHibernate to fetch a collection of items from a database into an IEnumerable<T> and sum its property's value. This is just a simple entity without any reference type:

public SimpleEntity
{
    public int Id {get;set}
    public string Name {get;set}
    public decimal Price {get;set}
}

Public Test
{
    void Main()
    {
        //this query get a list of about 200 items
        IEnumerable<SimpleEntity> entities = from entity in Session.Query<SimpleEntity>
                                             select entity;

        decimal value = 0.0;
        foreach(SimpleEntity item in entities)
        {
             //this for loop took 1.5 seconds 
             value += item.Price;
        }

        List<SimpleEntity> lstEntities = entities.ToList();

        foreach(SimpleEntity item in lstEntities)
        {
             //this for loop took less than a milisecond
             value += item.Price;
        }
    }
}
like image 920
Doan Cuong Avatar asked May 08 '14 08:05

Doan Cuong


People also ask

Which is faster IEnumerable or list?

IEnumerable is conceptually faster than List because of the deferred execution. Deferred execution makes IEnumerable faster because it only gets the data when needed. Contrary to Lists having the data in-memory all the time.

Which loop is faster in C#?

The forloop is faster than the foreach loop if the array must only be accessed once per iteration.

What is the difference between IEnumerable and list in C#?

One important difference between IEnumerable and List (besides one being an interface and the other being a concrete class) is that IEnumerable is read-only and List is not. So if you need the ability to make permanent changes of any kind to your collection (add & remove), you'll need List.

How do you traverse an object in C#?

Using this second form, to iterate over all the string properties of an object you could do: foreach (var property in PropertiesOfType<string>(myObject)) { var name = property. Key; var val = property. Value; ... }


2 Answers

Enumerating an IEnumerable<T> is 2 to 3 times slower than enumerating the same List<T> directly. This is due to a subtlety on how C# selects its enumerator for a given type.

List<T> exposes 3 enumerators:

  1. List<T>.Enumerator List<T>.GetEnumerator()
  2. IEnumerator<T> IEnumerable<T>.GetEnumerator()
  3. IEnumerator IEnumerable.GetEnumerator()

When C# compiles a foreach loop, it will select the enumerator in the above order. Note that a type doesn't need to implement IEnumerable or IEnumerable<T> to be enumerable, it just needs a method named GetEnumerator() that returns an enumerator.

Now, List<T>.GetEnumerator() has the advantage of being statically typed which makes all calls to List<T>.Enumerator.get_Current and List<T>.Enumerator.MoveNext() static-bound instead of virtual.

10M iterations (coreclr):

for(int i ...)               73 ms
foreach(... List<T>)        215 ms
foreach(... IEnumerable<T>) 698 ms
foreach(... IEnumerable)   1028 ms
for(int *p ...)              50 ms

10M iterations (Framework):

for(int i ...)              210 ms
foreach(... List<T>)        252 ms
foreach(... IEnumerable<T>) 537 ms
foreach(... IEnumerable)    844 ms
for(int *p ...)             202 ms

Disclaimer

I should point out the actual iteration in a list is rarely the bottleneck. Keep in mind those are hundreds of milliseconds over millions of iterations. Any work in the loop more complicated than a few arithmetic operations will be overwhelmingly costlier than the iteration itself.

like image 50
Coincoin Avatar answered Sep 20 '22 12:09

Coincoin


List<T> is an IEnumerable<T>. When you are iterating through your List<T>, you are performing the same sequence of operations as you are for any other IEnumerable<T>:

  • Get an IEnumerator<T>.
  • Invoke IEnumerator<T>.MoveNext() on your enumerator.
  • Take the IEnumerator<T>.Current element from the IEnumerator interface while MoveNext() returns true.
  • Dispose of the IEnumerator<T>.

What we know about List<T> is that it is an in-memory collection, so the MoveNext() function on its enumerator is going to be very cheap. It looks like your collection gives an enumerator whose MoveNext() method is more expensive, perhaps because it is interacting with some external resource such as a database connection.

When you call ToList() on your IEnumerable<T>, you are running a full iteration of your collection and loading all of the elements into memory with that iteration. This is worth doing if you expect to be iterating through the same collection multiple times. If you expect to iterate through the collection only once, then ToList() is a false economy: all it does is to create an in-memory collection that will later have to be garbage collected.

like image 31
Rob Lyndon Avatar answered Sep 21 '22 12:09

Rob Lyndon