Here is an interesting issue I noticed when using the Except operator.
I have a list of users from which I want to exclude some users.
The list of users comes from an XML file.
The code goes like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

interface IUser
{
    int ID { get; set; }
    string Name { get; set; }
}

class User : IUser
{
    #region IUser Members
    public int ID { get; set; }
    public string Name { get; set; }
    #endregion

    public override string ToString()
    {
        return ID + ":" + Name;
    }

    public static IEnumerable<IUser> GetMatchingUsers(IEnumerable<IUser> users)
    {
        IEnumerable<IUser> localList = new List<User>
        {
            new User { ID = 4, Name = "James" },
            new User { ID = 5, Name = "Tom" }
        }.OfType<IUser>();

        // Join on ID and return the elements that came from "users"
        var matches = from u in users
                      join lu in localList
                      on u.ID equals lu.ID
                      select u;
        return matches;
    }
}

class Program
{
    static void Main(string[] args)
    {
        XDocument doc = XDocument.Load("Users.xml");
        IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
            (u => new User
                {
                    ID = (int)u.Attribute("id"),
                    Name = (string)u.Attribute("name")
                }
            ).OfType<IUser>(); // still a query, objects have not been materialized
        var matches = User.GetMatchingUsers(users);
        var excludes = users.Except(matches); // excludes should contain 6 users but here it contains 8 users
    }
}
When I call User.GetMatchingUsers(users), I get 2 matches as expected.
The issue is that when I call users.Except(matches), the matching users are not excluded at all! I am expecting 6 users, but "excludes" contains all 8 users instead.
Since all GetMatchingUsers(IEnumerable<IUser> users) does is take the IEnumerable<IUser> and return the IUsers whose IDs match (2 IUsers in this case), my understanding is that Except will, by default, use reference equality to compare the objects to be excluded. Is this not how Except behaves?
What is even more interesting is that if I materialize the objects using .ToList(), then get the matching users and call Except, everything works as expected!
Like so:
IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
    (u => new User
        {
            ID = (int)u.Attribute("id"),
            Name = (string)u.Attribute("name")
        }
    ).OfType<IUser>().ToList(); // explicitly materializing all objects by calling ToList()
var matches = User.GetMatchingUsers(users);
var excludes = users.Except(matches); // excludes now contains 6 users as expected
I don't see why I should need to materialize the objects before calling Except, given that it is defined on IEnumerable<T>.
Any suggestions / insights would be much appreciated.
a) You need to override the GetHashCode function. It MUST return equal values for equal IUser objects. For example:
public override int GetHashCode()
{
    return ID.GetHashCode() ^ Name.GetHashCode();
}
b) You need to override the object.Equals(object obj) function in classes that implement IUser.
public override bool Equals(object obj)
{
    IUser other = obj as IUser;
    if (object.ReferenceEquals(other, null)) // return false if obj is null OR if obj doesn't implement IUser
        return false;
    return (this.ID == other.ID) && (this.Name == other.Name);
}
c) As an alternative to (b), IUser may inherit from IEquatable<IUser>:
interface IUser : IEquatable<IUser>
...
The User class will need to provide a bool Equals(IUser other) method in that case. The GetHashCode override from (a) is still required.
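To make option (c) concrete, here is a minimal sketch. It assumes the six-user data set is replaced by a generated sequence, and EquatableDemo and RemainingIds are names invented for this illustration. The null check on Name is an addition that the snippets above do not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

interface IUser : IEquatable<IUser>
{
    int ID { get; set; }
    string Name { get; set; }
}

class User : IUser
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Typed equality: two users are equal when ID and Name both match.
    public bool Equals(IUser other)
    {
        return other != null && ID == other.ID && Name == other.Name;
    }

    // Keep object.Equals consistent with the typed overload.
    public override bool Equals(object obj)
    {
        return Equals(obj as IUser);
    }

    // Equal users must produce equal hash codes; a null Name contributes 0.
    public override int GetHashCode()
    {
        return ID.GetHashCode() ^ (Name == null ? 0 : Name.GetHashCode());
    }
}

static class EquatableDemo
{
    // Runs Except against a deferred query and returns the surviving IDs.
    public static List<int> RemainingIds()
    {
        // Deferred: every enumeration creates fresh User instances.
        IEnumerable<IUser> users = Enumerable.Range(1, 6)
            .Select(i => (IUser)new User { ID = i, Name = "User" + i });

        IEnumerable<IUser> matches = users.Where(u => u.ID == 4 || u.ID == 5);

        // No ToList() anywhere: value equality makes the new instances comparable.
        return users.Except(matches).Select(u => u.ID).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", RemainingIds())); // prints "1, 2, 3, 6"
    }
}
```

Because equality no longer depends on object identity, it does not matter that the query builds new User objects on each pass.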
That's all. Now it works without calling the .ToList() method.
I think I know why this fails to work as expected. Because the initial user list is a LINQ query with deferred execution, it is re-evaluated each time it is iterated (once when used in GetMatchingUsers and again when doing the Except operation), so new User objects are created on every pass. This leads to different references and therefore no matches. Using ToList fixes this because it iterates the LINQ query only once, so the references are fixed.
I've been able to reproduce the problem you have, and having investigated the code, this seems like a very plausible explanation. I haven't proved it yet, though.
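The re-evaluation described above can be seen in isolation with a small sketch. Item, DeferredDemo, BuildQuery and SameFirstInstance are hypothetical names used only for this illustration; they are not part of the code in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Item
{
    public int ID { get; set; }
}

static class DeferredDemo
{
    // Counts how many times the projection lambda has run.
    public static int Created;

    public static IEnumerable<Item> BuildQuery()
    {
        // Nothing runs here; Select only records what to do on enumeration.
        return Enumerable.Range(1, 3).Select(i =>
        {
            Created++;
            return new Item { ID = i };
        });
    }

    // True when two separate passes over the sequence yield the same first instance.
    public static bool SameFirstInstance(IEnumerable<Item> source)
    {
        return object.ReferenceEquals(source.First(), source.First());
    }

    static void Main()
    {
        IEnumerable<Item> query = BuildQuery();
        Console.WriteLine(SameFirstInstance(query)); // False: each pass builds new objects

        List<Item> list = BuildQuery().ToList();
        Console.WriteLine(SameFirstInstance(list));  // True: the list holds fixed references
    }
}
```

The same thing happens in the question: the join inside GetMatchingUsers and the Except call each enumerate the query on their own, so they never see the same instances.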
Update
I just ran the test, outputting the users collection before the call to GetMatchingUsers, in that call, and after it. Each time, the hash code for each object was output, and they do indeed have different values each time, indicating new objects, as I suspected.
Here is the output for each of the calls:
==> Start
ID=1, Name=Jeff, HashCode=39086322
ID=2, Name=Alastair, HashCode=36181605
ID=3, Name=Anthony, HashCode=28068188
ID=4, Name=James, HashCode=33163964
ID=5, Name=Tom, HashCode=14421545
ID=6, Name=David, HashCode=35567111
<== End
==> Start
ID=1, Name=Jeff, HashCode=65066874
ID=2, Name=Alastair, HashCode=34160229
ID=3, Name=Anthony, HashCode=63238509
ID=4, Name=James, HashCode=11679222
ID=5, Name=Tom, HashCode=35410979
ID=6, Name=David, HashCode=57416410
<== End
==> Start
ID=1, Name=Jeff, HashCode=61940669
ID=2, Name=Alastair, HashCode=15193904
ID=3, Name=Anthony, HashCode=6303833
ID=4, Name=James, HashCode=40452378
ID=5, Name=Tom, HashCode=36009496
ID=6, Name=David, HashCode=19634871
<== End
And here is the modified code to show the problem:
using System.Xml.Linq;
using System.Collections.Generic;
using System.Linq;
using System;

interface IUser
{
    int ID { get; set; }
    string Name { get; set; }
}

class User : IUser
{
    #region IUser Members
    public int ID { get; set; }
    public string Name { get; set; }
    #endregion

    public override string ToString()
    {
        return ID + ":" + Name;
    }

    public static IEnumerable<IUser> GetMatchingUsers(IEnumerable<IUser> users)
    {
        IEnumerable<IUser> localList = new List<User>
        {
            new User { ID = 4, Name = "James" },
            new User { ID = 5, Name = "Tom" }
        }.OfType<IUser>();

        OutputUsers(users); // enumerates the query, creating a fresh set of objects

        var matches = from u in users
                      join lu in localList
                      on u.ID equals lu.ID
                      select u;
        return matches;
    }

    // Prints each user together with its default (identity-based) hash code.
    public static void OutputUsers(IEnumerable<IUser> users)
    {
        Console.WriteLine("==> Start");
        foreach (IUser user in users)
        {
            Console.WriteLine("ID=" + user.ID.ToString() + ", Name=" + user.Name + ", HashCode=" + user.GetHashCode().ToString());
        }
        Console.WriteLine("<== End");
    }
}

class Program
{
    static void Main(string[] args)
    {
        XDocument doc = new XDocument(
            new XElement(
                "Users",
                new XElement("User", new XAttribute("id", "1"), new XAttribute("name", "Jeff")),
                new XElement("User", new XAttribute("id", "2"), new XAttribute("name", "Alastair")),
                new XElement("User", new XAttribute("id", "3"), new XAttribute("name", "Anthony")),
                new XElement("User", new XAttribute("id", "4"), new XAttribute("name", "James")),
                new XElement("User", new XAttribute("id", "5"), new XAttribute("name", "Tom")),
                new XElement("User", new XAttribute("id", "6"), new XAttribute("name", "David"))));
        IEnumerable<IUser> users = doc.Element("Users").Elements("User").Select
            (u => new User
                {
                    ID = (int)u.Attribute("id"),
                    Name = (string)u.Attribute("name")
                }
            ).OfType<IUser>(); // still a query, objects have not been materialized
        User.OutputUsers(users);
        var matches = User.GetMatchingUsers(users);
        User.OutputUsers(users);
        var excludes = users.Except(matches); // with this 6-user document, excludes should contain 4 users but contains all 6
    }
}
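If changing the User class is not an option, Enumerable.Except also has an overload that takes an IEqualityComparer<T>. The following is only a sketch under assumptions: UserIdComparer and ComparerDemo are hypothetical names, and treating two users as equal when their IDs match is a design choice made for this illustration, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

interface IUser
{
    int ID { get; set; }
    string Name { get; set; }
}

class User : IUser
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two users are considered equal when their IDs match.
class UserIdComparer : IEqualityComparer<IUser>
{
    public bool Equals(IUser x, IUser y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(IUser user)
    {
        return user == null ? 0 : user.ID.GetHashCode();
    }
}

static class ComparerDemo
{
    // Runs Except on a deferred query, comparing users by ID instead of by reference.
    public static List<int> RemainingIds()
    {
        IEnumerable<IUser> users = Enumerable.Range(1, 6)
            .Select(i => (IUser)new User { ID = i, Name = "User" + i });

        IEnumerable<IUser> matches = users.Where(u => u.ID == 4 || u.ID == 5);

        return users.Except(matches, new UserIdComparer())
                    .Select(u => u.ID)
                    .ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", RemainingIds())); // prints "1, 2, 3, 6"
    }
}
```

This keeps the equality rule local to the one call that needs it, at the cost of having to remember to pass the comparer each time.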