Came across this code.
var dic = new Dictionary<int, string>();
for(int i=0; i<20000; i++)
{
dic.Add(i, i.ToString());
}
var list = dic.Where(f => f.Value.StartsWith("1")).Select(f => f.Key);//.ToList(); //uncomment for fast results
Console.WriteLine(list.GetType());
var list2 = dic.Where(f => list.Contains(f.Key)).ToList();
Console.WriteLine(list2.Count());
So when .ToList() is commented it's slow, when not - it's fast. Reproducable here How could this be explained? Should I always make everything ToList() to ensure speed (i.e. in which circumstances IEnumerable would be more preferable)? Note I'm talking only about linq to objects, I know linq to sql laziness and stuff.
2) if you iterate elements many times, or if you get an one element for multiple times, so, IEnumerable will be slow. Because of IEnumerable it does not contain result, so that it creates an new instance of result when it's used. It means, when you access an same element multiple times, each will be different object!
“IEnumerable describes behaviour, while List is an implementation of that behaviour. When you use IEnumerable, you give the compiler a chance to defer work until later, possibly optimising along the way. If you use ToList() you force the compiler to reify the results right away.”
One important difference between IEnumerable and List (besides one being an interface and the other being a concrete class) is that IEnumerable is read-only and List is not. So if you need the ability to make permanent changes of any kind to your collection (add & remove), you'll need List.
IQueryable is useful when we want to iterate a collection of objects which deals with ad-hoc queries against the data source or remote database, like SQL Server. IList is useful when we want to perform any operation like Add, Remove or Get item at specific index position in the collection.
This is because of deferred execution: when you comment out ToList
, the enumeration is produced by evaluating the sequence of filters for each item in the dictionary. When you do a ToList
, however, the sequence is "materialized" in memory, so all the evaluations are performed exactly once.
The logic behind the second Where
without ToList
looks like this:
// The logic is expanded for illustration only.
var list2 = new List<KeyValuePair<int,string>>();
foreach (var d in dict) {
var list = new List<int>();
// This nested loop does the same thing on each iteration,
// redoing n times what could have been done only once.
foreach (var f in dict) {
if (f.Value.StartsWith("1")) {
list.Add(f.Key);
}
}
if (list.Contains(d.Key)) {
list2.Add(d);
}
}
The logic with ToList
looks like this:
// The list is prepared once, and left alone
var list = new List<int>();
foreach (var f in dict) {
if (f.Value.StartsWith("1")) {
list.Add(f.Key);
}
}
var list2 = new List<KeyValuePair<int,string>>();
// This loop uses the same list in all its iterations.
foreach (var d in dict) {
if (list.Contains(d.Key)) {
list2.Add(d);
}
}
As you can see, the ToList
transforms an O(n^2)
program with two nested loops of size n
into O(2*n)
with two sequential loops of size n
each.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With