Today, I faced a problem with performance while iterating through a list of items. After done some diagnostic, I finally figured out the reason which slowed down performance. It turned out that iterating through an IEnumerable<T>
took much more time than iterating through a List<T>
. Please help me understand why IEnumerable<T>
is slower than List<T>
.
UPDATE benchmark context:
I'm using NHibernate to fetch a collection of items from a database into an IEnumerable<T>
and sum its property's value. This is just a simple entity without any reference type:
public SimpleEntity
{
public int Id {get;set}
public string Name {get;set}
public decimal Price {get;set}
}
Public Test
{
void Main()
{
//this query get a list of about 200 items
IEnumerable<SimpleEntity> entities = from entity in Session.Query<SimpleEntity>
select entity;
decimal value = 0.0;
foreach(SimpleEntity item in entities)
{
//this for loop took 1.5 seconds
value += item.Price;
}
List<SimpleEntity> lstEntities = entities.ToList();
foreach(SimpleEntity item in lstEntities)
{
//this for loop took less than a milisecond
value += item.Price;
}
}
}
IEnumerable is conceptually faster than List because of the deferred execution. Deferred execution makes IEnumerable faster because it only gets the data when needed. Contrary to Lists having the data in-memory all the time.
The forloop is faster than the foreach loop if the array must only be accessed once per iteration.
One important difference between IEnumerable and List (besides one being an interface and the other being a concrete class) is that IEnumerable is read-only and List is not. So if you need the ability to make permanent changes of any kind to your collection (add & remove), you'll need List.
Using this second form, to iterate over all the string properties of an object you could do: foreach (var property in PropertiesOfType<string>(myObject)) { var name = property. Key; var val = property. Value; ... }
Enumerating an IEnumerable<T>
is 2 to 3 times slower than enumerating the same List<T>
directly. This is due to a subtlety on how C# selects its enumerator for a given type.
List<T>
exposes 3 enumerators:
List<T>.Enumerator List<T>.GetEnumerator()
IEnumerator<T> IEnumerable<T>.GetEnumerator()
IEnumerator IEnumerable.GetEnumerator()
When C# compiles a foreach
loop, it will select the enumerator in the above order. Note that a type doesn't need to implement IEnumerable
or IEnumerable<T>
to be enumerable, it just needs a method named GetEnumerator()
that returns an enumerator.
Now, List<T>.GetEnumerator()
has the advantage of being statically typed which makes all calls to List<T>.Enumerator.get_Current
and List<T>.Enumerator.MoveNext()
static-bound instead of virtual.
10M iterations (coreclr):
for(int i ...) 73 ms
foreach(... List<T>) 215 ms
foreach(... IEnumerable<T>) 698 ms
foreach(... IEnumerable) 1028 ms
for(int *p ...) 50 ms
10M iterations (Framework):
for(int i ...) 210 ms
foreach(... List<T>) 252 ms
foreach(... IEnumerable<T>) 537 ms
foreach(... IEnumerable) 844 ms
for(int *p ...) 202 ms
Disclaimer
I should point out the actual iteration in a list is rarely the bottleneck. Keep in mind those are hundreds of milliseconds over millions of iterations. Any work in the loop more complicated than a few arithmetic operations will be overwhelmingly costlier than the iteration itself.
List<T>
is an IEnumerable<T>
. When you are iterating through your List<T>
, you are performing the same sequence of operations as you are for any other IEnumerable<T>
:
IEnumerator<T>
.IEnumerator<T>.MoveNext()
on your enumerator.IEnumerator<T>.Current
element from the IEnumerator interface while MoveNext()
returns true
.IEnumerator<T>
.What we know about List<T>
is that it is an in-memory collection, so the MoveNext()
function on its enumerator is going to be very cheap. It looks like your collection gives an enumerator whose MoveNext()
method is more expensive, perhaps because it is interacting with some external resource such as a database connection.
When you call ToList()
on your IEnumerable<T>
, you are running a full iteration of your collection and loading all of the elements into memory with that iteration. This is worth doing if you expect to be iterating through the same collection multiple times. If you expect to iterate through the collection only once, then ToList()
is a false economy: all it does is to create an in-memory collection that will later have to be garbage collected.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With