LINQ Grouping by custom type not working

I have data I receive from a web service via HttpWebRequest. After I deserialize it with Newtonsoft.Json into a custom type (a simple class with public string properties), I want to manipulate this data using LINQ; more specifically, I want to group it.

Grouping works fine if I group by a single string property:

from x in myList
group x by x.myStr into grp
select grp;

Since I want to group by several columns, I use a custom type as the key:

new MyType { a = ..., b = ... }

The grouping, however, does not work. I thought the reason must be the compiler does not know how to compare these objects - so if this type implements IEqualityComparer<MyType> it will solve it.

But no, it is still not grouping accordingly, and it creates several keys with the exact same string values.

The custom type by which I am grouping is something like

public class MyType
{
    public string a;
    public string b;
    public string c;
}

Any ideas what I am missing?

Here's a concrete example of the scenario described above:

//The type that models the data returned from the web service
public class MyClass
{
    public string a { get; set; }

    public string b { get; set; }

    public string c { get; set; }

    public DateTime d { get; set; }

    public DateTime e { get; set; }
}

// the type by which I want to group my data
public class MyGroup : IEquatable<MyGroup>, IEqualityComparer<MyGroup>
{
    public string f1 { get; set; }

    public DateTime d1 { get; set; }

    public DateTime d2 { get; set; }

    public bool Equals(MyGroup other)
    {
        return string.Compare(this.f1, other.f1) == 0;
    }

    public bool Equals(MyGroup x, MyGroup y)
    {
        return string.Compare(x.f1, y.f1) == 0;
    }

    public int GetHashCode(MyGroup obj)
    {
        return obj.GetHashCode();
    }
}

    // Sample data: nine rows to be grouped
    List<MyClass> l = new List<MyClass>();
    l.Add(new MyClass { a = "aaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20090105", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aa", b = "bbbb", c = "cccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaa", b = "bbbbb", c = "ccc", d = DateTime.ParseExact("20121111", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbbbb", c = "ccc", d = DateTime.ParseExact("20121111", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbbb", c = "cccccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaaa", b = "bbb", c = "cccc", d = DateTime.ParseExact("20090105", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140301", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaa", b = "bbb", c = "cccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });

    //The following does not really group
    //IEnumerable<IGrouping<MyGroup, MyClass>> r = from x in l
    IEnumerable<IGrouping<string, MyClass>> r = from x in l
                                                //group x by new MyGroup { f1 = x.a /*, d1 = x.d, d2 = x.e*/ } into grp
                                                orderby x.a
                                                group x by x.a into grp
                                                select grp;

    //foreach (IGrouping<MyGroup, MyClass> g in r)
    foreach (IGrouping<string, MyClass> g in r)
    {
        //Console.WriteLine(g.Key.f1);

        Console.WriteLine(g.Key);
    }
Veverke asked Jul 15 '14 14:07


2 Answers

I thought the reason must be the compiler does not know how to compare these objects - so if this type implements IEqualityComparer<MyType> it will solve it.

Actually, to use a custom "equality" check in LINQ functions you need to implement IEquatable<T>. IEquatable<T> is used to compare an instance of an object with another object of the same type, while IEqualityComparer<T> is meant to be implemented by an external class to compare two arbitrary Ts (and/or to have multiple methods of determining "equality").
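
To illustrate the distinction, here is a minimal sketch of the external-comparer route: a separate class implements IEqualityComparer<MyGroup> and is handed to the GroupBy overload that accepts a comparer. The class name MyGroupF1Comparer, the ComparerDemo helper and the trimmed-down MyGroup are assumptions made for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down version of the question's key type (illustrative only).
public class MyGroup
{
    public string f1 { get; set; }
}

// Hypothetical external comparer: two MyGroup instances are equal when f1 matches.
public class MyGroupF1Comparer : IEqualityComparer<MyGroup>
{
    public bool Equals(MyGroup x, MyGroup y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.f1, y.f1, StringComparison.Ordinal);
    }

    // Must agree with Equals: keys that compare equal must hash alike.
    public int GetHashCode(MyGroup obj)
    {
        return obj == null || obj.f1 == null ? 0 : StringComparer.Ordinal.GetHashCode(obj.f1);
    }
}

public static class ComparerDemo
{
    public static List<string> GroupKeys(IEnumerable<string> values)
    {
        // Query syntax has no slot for a comparer, so method syntax is used.
        return values
            .GroupBy(v => new MyGroup { f1 = v }, new MyGroupF1Comparer())
            .Select(g => g.Key.f1 + ":" + g.Count())
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in GroupKeys(new[] { "aaa", "aaaa", "aaa", "aa", "aaaa" }))
            Console.WriteLine(line);
    }
}
```

With this approach the key type itself stays untouched, which is handy when the same type needs more than one notion of equality.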

Note that you should also override Object.Equals and Object.GetHashCode - IEquatable<T> just allows you to compare in a type-safe manner.

Why the need for overriding Object's Equals and GetHashCode?

To ensure that any method (Object.Equals(object), the static Object.Equals(object, object), etc.) used to compare two objects is consistent. And any time you override Equals, you should also override GetHashCode to ensure that objects can be properly stored in a hash-based collection like a Dictionary or HashSet.
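
Applied to the question's MyGroup, that advice could look like the sketch below. It assumes the group key should be the combination of f1, d1 and d2 (the question's own Equals compares only f1), and the MyGroupDemo helper with its sample rows is purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class MyGroup : IEquatable<MyGroup>
{
    public string f1 { get; set; }
    public DateTime d1 { get; set; }
    public DateTime d2 { get; set; }

    // Type-safe comparison (IEquatable<MyGroup>).
    public bool Equals(MyGroup other)
    {
        if (other is null) return false;
        return string.Equals(f1, other.f1, StringComparison.Ordinal)
            && d1 == other.d1
            && d2 == other.d2;
    }

    // Object.Equals delegates to the type-safe overload so both always agree.
    public override bool Equals(object obj)
    {
        return Equals(obj as MyGroup);
    }

    // Built from the same members as Equals, so equal keys hash alike.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (f1 == null ? 0 : f1.GetHashCode());
            hash = hash * 31 + d1.GetHashCode();
            hash = hash * 31 + d2.GetHashCode();
            return hash;
        }
    }
}

public static class MyGroupDemo
{
    public static int CountGroups()
    {
        // Assumed sample rows; the first two share the same key values.
        var rows = new[]
        {
            new { a = "aaa", d = new DateTime(2008, 5, 14), e = new DateTime(2014, 1, 1) },
            new { a = "aaa", d = new DateTime(2008, 5, 14), e = new DateTime(2014, 1, 1) },
            new { a = "aaa", d = new DateTime(2012, 11, 11), e = new DateTime(2014, 1, 1) },
        };

        var groups = from x in rows
                     group x by new MyGroup { f1 = x.a, d1 = x.d, d2 = x.e } into grp
                     select grp;

        return groups.Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountGroups());
    }
}
```

The grouping operator looks keys up by hash code first and only then calls Equals, which is why an Equals implementation alone is not enough.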

What does it mean that IEquatable<T> only compares in a type-safe manner?

When using IEquatable<T>, the object you're comparing to is guaranteed to be a T (or a subtype of T), whereas with Object.Equals, you don't know the type of the other object and must check its type first.

For example:

// IEquatable<T>.Equals()
public bool Equals(MyGroup other)
{
    return string.Compare(this.f1, other.f1) == 0;
}

versus

// Object.Equals()
public override bool Equals(object other)
{
    // need to check the type of the passed-in object
    MyGroup grp = other as MyGroup;

    // other is null or not a MyGroup
    if (grp == null) return false;

    return string.Compare(this.f1, grp.f1) == 0;

    // you could also use
    //    return this.Equals(grp);
    // as a shortcut to reuse the same "equality" logic
}
D Stanley answered Sep 28 '22 02:09


Any ideas what I am missing?

Something like:

public class MyType : IEquatable<MyType>
{
  public string a;
  public string b;
  public string c;

  public bool Equals(MyType other)
  {
    if (other == null)
      return false;

    if (GetType() != other.GetType()) // can be omitted if you mark the CLASS as sealed
      return false;

    return a == other.a && b == other.b && c == other.c;
  }

  public override bool Equals(object obj)
  {
    return Equals(obj as MyType);
  }

  public override int GetHashCode()
  {
    int hash = 0;
    if (a != null)
      hash ^= a.GetHashCode();
    if (b != null)
      hash ^= b.GetHashCode();
    if (c != null)
      hash ^= c.GetHashCode();
    return hash;
  }
}

Addition: Note that MyType above is mutable, and its hash code changes if any of the fields a, b or c is reassigned. That is problematic if the reassignment happens while the instance is being held in a Dictionary<MyType, whatever>, HashSet<MyType> etc.
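
One way to sidestep that hazard is to make the key immutable. The sketch below is a variation on the class above, not the answerer's code: the names ImmutableKey and ImmutableKeyDemo and the constructor are assumptions, and it relies on get-only properties assigned once in the constructor (C# 6 or later).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable key: its members cannot change after construction,
// so its hash code stays stable while it sits in a hash-based collection.
public sealed class ImmutableKey : IEquatable<ImmutableKey>
{
    public string A { get; }
    public string B { get; }
    public string C { get; }

    public ImmutableKey(string a, string b, string c)
    {
        A = a;
        B = b;
        C = c;
    }

    public bool Equals(ImmutableKey other)
    {
        return !(other is null) && A == other.A && B == other.B && C == other.C;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ImmutableKey);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (A == null ? 0 : A.GetHashCode());
            hash = hash * 31 + (B == null ? 0 : B.GetHashCode());
            hash = hash * 31 + (C == null ? 0 : C.GetHashCode());
            return hash;
        }
    }
}

public static class ImmutableKeyDemo
{
    public static bool StillFound()
    {
        var set = new HashSet<ImmutableKey> { new ImmutableKey("x", "y", "z") };

        // A distinct but equal instance is found, and nothing can alter
        // the stored key's hash code afterwards.
        return set.Contains(new ImmutableKey("x", "y", "z"));
    }

    public static void Main()
    {
        Console.WriteLine(StillFound());
    }
}
```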


Alternatively, you could "group by" an anonymous type as suggested in DavidG's answer, or "group by" Tuple.Create(.. , .. , ..).
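
Both alternatives work because anonymous types and System.Tuple already compare and hash by the values of their members. A minimal sketch, using a hypothetical Row class in place of the question's MyClass and assuming a two-column key of a and b:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's MyClass.
public class Row
{
    public string a { get; set; }
    public string b { get; set; }
    public string c { get; set; }
}

public static class CompositeKeyDemo
{
    // Assumed sample data: the first two rows share the same (a, b) pair.
    static readonly List<Row> Rows = new List<Row>
    {
        new Row { a = "aaa", b = "bbb",  c = "ccc"  },
        new Row { a = "aaa", b = "bbb",  c = "cccc" },
        new Row { a = "aa",  b = "bbbb", c = "cccc" },
    };

    public static int GroupByAnonymousType()
    {
        var groups = from x in Rows
                     group x by new { x.a, x.b } into grp
                     select grp;
        return groups.Count();
    }

    public static int GroupByTuple()
    {
        var groups = from x in Rows
                     group x by Tuple.Create(x.a, x.b) into grp
                     select grp;
        return groups.Count();
    }

    public static void Main()
    {
        Console.WriteLine(GroupByAnonymousType());
        Console.WriteLine(GroupByTuple());
    }
}
```

Neither variant needs any hand-written Equals or GetHashCode, which makes them the shortest route when the key is only used inside the query.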

Jeppe Stig Nielsen answered Sep 28 '22 02:09