I just can't figure out why the item in my filtered list is not found. I have simplified the example to show it. I have a class Item...
public class Item
{
public Item(string name)
{
Name = name;
}
public string Name
{
get; set;
}
public override string ToString()
{
return Name;
}
}
... and a class 'Items' which should filter the items and check if the first item is still in the list...
public class Items
{
private IEnumerable<Item> _items;
public Items(IEnumerable<Item> items)
{
_items = items;
}
public List<Item> Filter(string word)
{
var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));
Console.WriteLine("found: " + ret.Contains(_items.First()));
// found: False
return ret;
}
}
The executing code looks like this:
static void Main(string[] args)
{
string[] itemNames = new string[] { "a", "b", "c" };
Items list = new Items(itemNames.Select(x => new Item(x)));
list.Filter("a");
Console.ReadLine();
}
Now, if I execute the program, the Console.WriteLine outputs that the item is not found. But why?
If I change the first line in the constructor to
_items = items.ToList()
then, it can find it. If I undo that line and call ToList() later in the Filter-method, it also cannot find the item?!
public class Items
{
private IEnumerable<Item> _items;
public Items(IEnumerable<Item> items)
{
_items = items;
}
public List<Item> FilteredItems
{
get; set;
}
public List<Item> Filter(string word)
{
var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));
_items = _items.ToList();
Console.WriteLine("found: " + ret.Contains(_items.First()));
// found: False
return ret;
}
}
Why is there a difference where and when the lambda expression is executed and why isn't the item found any more? I don't get it!
The reason is deferred execution.
You initialize the _items field to
itemNames.Select(x => new Item(x));
This is a query, not the result of that query. The query is executed every time you iterate over _items.
So in this line of your Filter method:
var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));
the source array is enumerated and a new Item(x) is created for each string. These items are stored in your list ret.
When you call Contains(_items.First()) after that, First() executes the query in _items again, creating a new Item instance from the first source string.
Since Item's Equals method is not overridden, Contains falls back to a simple reference equality check, and the first Item returned from the second iteration is a different instance of Item than the one in your list.
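To make the re-execution visible, here is a minimal sketch that counts constructor calls. The `CountingItem` class and its static `Created` counter are hypothetical names introduced only for this illustration; they are not part of the code in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Item that counts how many instances were constructed.
public class CountingItem
{
    public static int Created;

    public CountingItem(string name)
    {
        Name = name;
        Created++;
    }

    public string Name { get; }
}

public static class DeferredDemo
{
    public static void Main()
    {
        string[] itemNames = { "a", "b", "c" };

        // Building the query constructs nothing yet.
        IEnumerable<CountingItem> query = itemNames.Select(x => new CountingItem(x));
        Console.WriteLine(CountingItem.Created); // 0

        // First enumeration: the filter pulls all three strings through Select.
        List<CountingItem> ret = query.Where(x => x.Name.Contains("a")).ToList();
        Console.WriteLine(CountingItem.Created); // 3

        // Second enumeration: First() runs the query again and gets a fresh instance.
        CountingItem first = query.First();
        Console.WriteLine(CountingItem.Created); // 4

        Console.WriteLine(ReferenceEquals(ret[0], first)); // False
        Console.WriteLine(ret[0].Name == first.Name);      // True
    }
}
```

Both objects carry the name "a", but they are two separate instances, which is why the default Contains check fails.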
Let's remove extra code to see the problem:
var itemNames = new [] { "a", "b", "c" };
var items1 = itemNames.Select(x => new Item(x));
var surprise = items1.Contains(items1.First()); // False
The collection items1 appears not to contain its initial element!
Adding ToList() fixes the problem:
var items2 = itemNames.Select(x => new Item(x)).ToList();
var noSurprise = items2.Contains(items2.First()); // True
The reason why you see different results with and without ToList() is that (1) items1 is evaluated lazily, and (2) your Item class does not implement Equals/GetHashCode. Using ToList() makes default equality work; implementing a custom equality check would fix the problem even when the sequence is enumerated multiple times.
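As a rough sketch of the custom equality option, the class below compares items by Name. `ValueItem` is a hypothetical name, and making Name read-only is an assumption made here so the hash code cannot change after construction; the Item class in the question has a settable Name.

```csharp
using System;
using System.Linq;

// Hypothetical variant of Item with value equality based on Name.
public sealed class ValueItem : IEquatable<ValueItem>
{
    public ValueItem(string name)
    {
        Name = name;
    }

    public string Name { get; }

    // Two items are equal when their names are equal.
    public bool Equals(ValueItem other) => other != null && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as ValueItem);

    // Equal items must produce equal hash codes.
    public override int GetHashCode() => Name == null ? 0 : Name.GetHashCode();

    public override string ToString() => Name;
}

public static class EqualityDemo
{
    public static void Main()
    {
        var itemNames = new[] { "a", "b", "c" };

        // Still a lazy query: every enumeration creates new instances.
        var items = itemNames.Select(x => new ValueItem(x));

        // Contains now compares by Name, so a fresh instance is still found.
        Console.WriteLine(items.Contains(items.First())); // True
    }
}
```

This makes the lookup succeed, but the query still re-runs on every enumeration, so materializing the sequence remains worthwhile.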
The main lesson from this exercise is that storing an IEnumerable<T> that is passed to your constructor is dangerous. This is only one of the reasons; other reasons include multiple enumeration and possible modification of the sequence after your code has validated its input. You should call ToList or ToArray on sequences passed into constructors to avoid these problems:
public Items(IEnumerable<Item> items) {
_items = items.ToList();
}
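The "modification after validation" risk can be illustrated with a small sketch. `LazyItems` and `SnapshotItems` are hypothetical classes invented for this example; they hold plain strings to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class that keeps a reference to the caller's sequence.
public class LazyItems
{
    private readonly IEnumerable<string> _names;

    public LazyItems(IEnumerable<string> names)
    {
        _names = names;
    }

    public int Count => _names.Count();
}

// Hypothetical class that copies the sequence once, in the constructor.
public class SnapshotItems
{
    private readonly List<string> _names;

    public SnapshotItems(IEnumerable<string> names)
    {
        _names = names.ToList();
    }

    public int Count => _names.Count;
}

public static class SnapshotDemo
{
    public static void Main()
    {
        var source = new List<string> { "a", "b", "c" };
        var lazy = new LazyItems(source);
        var snapshot = new SnapshotItems(source);

        // The caller changes the list after both objects were constructed.
        source.Add("d");

        Console.WriteLine(lazy.Count);     // 4: sees the caller's change
        Console.WriteLine(snapshot.Count); // 3: unaffected by the change
    }
}
```

The copy costs one enumeration up front, and in exchange the object's contents no longer depend on what the caller does with the original sequence.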
There are two problems in your code.
The first problem is that new items are created every time the sequence is enumerated. That is, you don't store the actual items when you write:
IEnumerable<Item> items = itemNames.Select(x => new Item(x));
The execution of Select is deferred, i.e. every time you enumerate the sequence (for example by calling .ToList()), a new set of Items is created using itemNames as the source.
The second problem is that you are comparing items by reference here:
Console.WriteLine("found: " + ret.Contains(_items.First()));
When you use ToList, you store the items in a list and the references remain the same, so the item is found by reference.
When you don't use ToList, the references are no longer the same, because a new Item is created every time. You can't find your item using a different reference.