Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why IEnumerable slow and List is fast?

Tags:

Came across this code.

var dic = new Dictionary<int, string>();
for(int i=0; i<20000; i++)
{
    dic.Add(i, i.ToString());
}

var list = dic.Where(f => f.Value.StartsWith("1")).Select(f => f.Key);//.ToList(); //uncomment for fast results 
Console.WriteLine(list.GetType());
var list2 = dic.Where(f => list.Contains(f.Key)).ToList();
Console.WriteLine(list2.Count());

So when .ToList() is commented it's slow, when not - it's fast. Reproducable here How could this be explained? Should I always make everything ToList() to ensure speed (i.e. in which circumstances IEnumerable would be more preferable)? Note I'm talking only about linq to objects, I know linq to sql laziness and stuff.

like image 852
ren Avatar asked Oct 30 '13 17:10

ren


People also ask

Is IEnumerable slow?

2) if you iterate elements many times, or if you get an one element for multiple times, so, IEnumerable will be slow. Because of IEnumerable it does not contain result, so that it creates an new instance of result when it's used. It means, when you access an same element multiple times, each will be different object!

Should Use IEnumerable instead of List?

“IEnumerable describes behaviour, while List is an implementation of that behaviour. When you use IEnumerable, you give the compiler a chance to defer work until later, possibly optimising along the way. If you use ToList() you force the compiler to reify the results right away.”

What's the difference between IEnumerable T and List t?

One important difference between IEnumerable and List (besides one being an interface and the other being a concrete class) is that IEnumerable is read-only and List is not. So if you need the ability to make permanent changes of any kind to your collection (add & remove), you'll need List.

What is difference between IQueryable and List in C#?

IQueryable is useful when we want to iterate a collection of objects which deals with ad-hoc queries against the data source or remote database, like SQL Server. IList is useful when we want to perform any operation like Add, Remove or Get item at specific index position in the collection.


1 Answers

This is because of deferred execution: when you comment out ToList, the enumeration is produced by evaluating the sequence of filters for each item in the dictionary. When you do a ToList, however, the sequence is "materialized" in memory, so all the evaluations are performed exactly once.

The logic behind the second Where without ToList looks like this:

// The logic is expanded for illustration only.
var list2 = new List<KeyValuePair<int,string>>();
foreach (var d in dict) {
    var list = new List<int>();
    // This nested loop does the same thing on each iteration,
    // redoing n times what could have been done only once.
    foreach (var f in dict) {
        if (f.Value.StartsWith("1")) {
            list.Add(f.Key);
        }
    }
    if (list.Contains(d.Key)) {
        list2.Add(d);
    }
}

The logic with ToList looks like this:

// The list is prepared once, and left alone
var list = new List<int>();
foreach (var f in dict) {
    if (f.Value.StartsWith("1")) {
        list.Add(f.Key);
    }
}
var list2 = new List<KeyValuePair<int,string>>();
// This loop uses the same list in all its iterations.
foreach (var d in dict) {
    if (list.Contains(d.Key)) {
        list2.Add(d);
    }
}

As you can see, the ToList transforms an O(n^2) program with two nested loops of size n into O(2*n) with two sequential loops of size n each.

like image 51
Sergey Kalinichenko Avatar answered Sep 17 '22 13:09

Sergey Kalinichenko