An item in IEnumerable does not equal an item in List

I just can't figure out why the item in my filtered list is not found. I have simplified the example to show it. I have a class Item...

public class Item
{
    public Item(string name)
    {
        Name = name;
    }

    public string Name
    {
        get; set;
    }

    public override string ToString()
    {
        return Name;
    }
}

... and a class 'Items' which should filter the items and check if the first item is still in the list...

public class Items
{
    private IEnumerable<Item> _items;

    public Items(IEnumerable<Item> items)
    {
        _items = items;
    }

    public List<Item> Filter(string word)
    {
        var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));

        Console.WriteLine("found: " + ret.Contains(_items.First()));
        // found: false

        return ret;
    }
}

The executing code looks like this:

static void Main(string[] args)
{
    string[] itemNames = new string[] { "a", "b", "c" };

    Items list = new Items(itemNames.Select(x => new Item(x)));
    list.Filter("a");

    Console.ReadLine();
}

Now, if I execute the program, the Console.WriteLine outputs that the item is not found. But why?

If I change the first line in the constructor to

_items = items.ToList();

then, it can find it. If I undo that line and call ToList() later in the Filter-method, it also cannot find the item?!

public class Items
{
    private IEnumerable<Item> _items;

    public Items(IEnumerable<Item> items)
    {
        _items = items;
    }

    public List<Item> FilteredItems
    {
        get; set;
    }

    public List<Item> Filter(string word)
    {
        var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));

        _items = _items.ToList();
        Console.WriteLine("found: " + ret.Contains(_items.First()));
        // found: false

        return ret;
    }
}

Why does it make a difference where and when the lambda expression is executed, and why isn't the item found any more? I don't get it!

melwynoo asked Dec 21 '16 15:12


3 Answers

The reason is deferred execution.

You initialize the _items field to

itemNames.Select(x => new Item(x));

This is a query, not the answer to that query. This query is executed every time you iterate over _items.

So in this line of your Filter method:

var ret = new List<Item>(_items.Where(x => x.Name.Contains(word)));

the source array is enumerated and a new Item(x) created for each string. These items are stored in your list ret.

When you call Contains(_items.First()) after that, First() again executes the query in _items, creating new Item instances for each source string.

Since Item's Equals method is probably not overridden and performs a simple reference equality check, the first Item returned from the second iteration is a different instance of Item than the one in your list.
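To make the repeated execution visible, here is a minimal sketch that counts how many objects the query creates. CountingItem, its static Created counter, and DeferredExecutionDemo are hypothetical names introduced only for this illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CountingItem
{
    // Hypothetical counter, added only to make object creation visible.
    public static int Created;

    public CountingItem(string name)
    {
        Name = name;
        Created++;
    }

    public string Name { get; set; }
}

public static class DeferredExecutionDemo
{
    // Returns the counter value after each step and the result of the reference comparison.
    public static (int afterSelect, int afterFilter, int afterFirst, bool sameInstance) Run()
    {
        CountingItem.Created = 0;
        string[] itemNames = { "a", "b", "c" };

        // Only the query is stored here; no CountingItem exists yet.
        IEnumerable<CountingItem> items = itemNames.Select(x => new CountingItem(x));
        int afterSelect = CountingItem.Created;

        // First enumeration: the query runs over all three names.
        List<CountingItem> ret = items.Where(x => x.Name.Contains("a")).ToList();
        int afterFilter = CountingItem.Created;

        // Second enumeration: First() runs the query again and builds a new "a".
        CountingItem first = items.First();
        int afterFirst = CountingItem.Created;

        return (afterSelect, afterFilter, afterFirst, ReferenceEquals(ret[0], first));
    }
}
```

Calling DeferredExecutionDemo.Run() should yield (0, 3, 4, False): nothing is created when the query is defined, three objects are created by the filter pass, and First() creates a fourth object that is not the instance stored in ret.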

René Vogt answered Sep 20 '22 15:09


Let's remove extra code to see the problem:

var itemNames = new [] { "a", "b", "c" };
var items1 = itemNames.Select(x => new Item(x));
var surprise = items1.Contains(items1.First());   // False

The collection items1 appears not to contain its initial element!

Adding ToList() fixes the problem:

var items2 = itemNames.Select(x => new Item(x)).ToList();
var noSurprise = items2.Contains(items2.First()); // True

The reason why you see different results with and without ToList() is that (1) items1 is evaluated lazily, and (2) your Item class does not override Equals/GetHashCode, so items are compared by reference. Using ToList() makes the default reference equality work because the same instances are reused; implementing a custom equality check would fix the problem even when the sequence is enumerated multiple times.
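A custom equality check could look roughly like the sketch below. It assumes that two items should count as equal whenever their names match, which the question does not state; EqualityDemo is a hypothetical helper used only to show the effect.

```csharp
using System;
using System.Linq;

public class Item : IEquatable<Item>
{
    public Item(string name)
    {
        Name = name;
    }

    public string Name { get; set; }

    // Assumption: two items are equal when their names are equal.
    public bool Equals(Item other)
    {
        return other != null && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Item);
    }

    // Must agree with Equals: equal names produce equal hash codes.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }

    public override string ToString()
    {
        return Name;
    }
}

public static class EqualityDemo
{
    public static bool LazyContainsFirst()
    {
        var itemNames = new[] { "a", "b", "c" };

        // Still a lazy query: every enumeration creates new Item instances.
        var items1 = itemNames.Select(x => new Item(x));

        // Contains now compares by name, so a freshly created "a" matches.
        return items1.Contains(items1.First());
    }
}
```

With this version EqualityDemo.LazyContainsFirst() returns true even without ToList(). One caveat of this sketch: Name has a public setter, so changing it while an item sits in a hash-based collection would change its hash code; a read-only Name would be safer.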

The main lesson from this exercise is that storing an IEnumerable<T> that is passed to your constructor is dangerous. This is only one of the reasons; other reasons include the cost of multiple enumeration and possible modification of the sequence after your code has validated its input. You should call ToList or ToArray on sequences passed into constructors to avoid these problems:

public Items(IEnumerable<Item> items) {
    _items = items.ToList();
}
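The point about later modification can be sketched as follows. LiveView, Snapshot, and SnapshotDemo are hypothetical classes made up for this illustration; they only contrast storing the sequence as passed with copying it in the constructor.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stores the sequence as passed: later changes by the caller remain visible.
public class LiveView
{
    private readonly IEnumerable<string> _names;

    public LiveView(IEnumerable<string> names)
    {
        _names = names;
    }

    public int Count()
    {
        return _names.Count();
    }
}

// Copies the sequence once: later changes by the caller have no effect.
public class Snapshot
{
    private readonly List<string> _names;

    public Snapshot(IEnumerable<string> names)
    {
        _names = names.ToList();
    }

    public int Count()
    {
        return _names.Count;
    }
}

public static class SnapshotDemo
{
    public static (int live, int snapshot) Run()
    {
        var source = new List<string> { "a", "b", "c" };
        var live = new LiveView(source);
        var snapshot = new Snapshot(source);

        // The caller modifies the list after both objects were constructed.
        source.Add("d");

        return (live.Count(), snapshot.Count());
    }
}
```

SnapshotDemo.Run() should return (4, 3): the object that kept the original reference sees the added element, while the one that called ToList() still holds the three elements it was constructed with.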

Sergey Kalinichenko answered Sep 19 '22 15:09


There are two problems in your code.

The first problem is that you are creating a new item every time. That is, you don't store the actual items when you write:

IEnumerable<Item> items = itemNames.Select(x => new Item(x));

The execution of Select is deferred, i.e. every time you enumerate the query (for example by calling .ToList() or .First()), a new set of Items is created using itemNames as the source.

The second problem is that you are comparing items by reference here:

Console.WriteLine("found: " + ret.Contains(_items.First()));

When you use ToList, you store the items in a list and the references remain the same, so the item is found by reference.

When you don't use ToList, the references are no longer the same, because a new Item is created on every enumeration. You can't find your item through a different reference.

M.kazem Akhgary answered Sep 18 '22 15:09