
How to use IEnumerable.GroupBy comparing multiple properties between elements?

Tags: c#, linq

How do I group "adjacent" sites?

Given data:

List<Site> sites = new List<Site> {
    new Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
    new Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m },
    new Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m },
    new Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m },
    new Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
    new Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m },
};

I want this result:

[
    [
        Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
        Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m }
    ],
    [
        Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m }
    ],
    [
        Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m }
    ],
    [
        Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
        Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m }
    ]
]

I tried using GroupBy with a custom comparer that checks that the RouteIds match and that the first site's EndMilepost equals the next site's StartMilepost. My hash function only uses RouteId, so all sites within a route land in the same bucket. However, I think GroupBy assumes the comparer is transitive (if A = B and B = C, then A = C), so C won't be grouped with A and B because, in my adjacency case, A does not equal C.

xb4rrm27m5 asked May 13 '19 14:05


1 Answer

First, let the Site class be as follows (ToString is overridden for debugging and demonstration):

public class Site {
  public Site() { }

  public string RouteId;
  public Decimal StartMilepost;
  public Decimal EndMilepost;

  public override string ToString() => $"{RouteId} {StartMilepost}..{EndMilepost}";
}

As you have noticed, we have to break a rule here: equality must be transitive, i.e. whenever

A equals B
B equals C

then

A equals C

That is not the case in your example. However, if we sort the sites by StartMilepost, we can technically implement IEqualityComparer<Site> like this:

public class MySiteEqualityComparer : IEqualityComparer<Site> {
  public bool Equals(Site x, Site y) {
    if (ReferenceEquals(x, y))
      return true;
    else if (null == x || null == y)
      return false;
    else if (x.RouteId != y.RouteId)
      return false;
    else if (x.StartMilepost <= y.StartMilepost && x.EndMilepost >= y.StartMilepost)
      return true;
    else if (y.StartMilepost <= x.StartMilepost && y.EndMilepost >= x.StartMilepost)
      return true;

    return false;
  }

  public int GetHashCode(Site obj) {
    return obj == null
      ? 0
      : obj.RouteId == null
         ? 0
         : obj.RouteId.GetHashCode();
  }
}

Then use GroupBy as usual. Note that OrderBy is required, since the order of comparison matters here. Suppose we have

A = {RouteId="X", StartMilepost=0.00m, EndMilepost=1.00m}
B = {RouteId="X", StartMilepost=1.00m, EndMilepost=2.00m}
C = {RouteId="X", StartMilepost=2.00m, EndMilepost=3.00m}

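Here A == B and B == C, but A != C, so the comparer is not transitive and the groups you get depend on the order in which the items are compared. Be aware that the LINQ to Objects GroupBy compares each item with the key of an existing group, which is the first item placed in that group, not with the group's most recent item. C is therefore compared with A rather than with B, so a chain of three or more adjacent sites can still be split; the sample data below only contains chains of at most two sites.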
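To make the non-transitivity concrete, here is a minimal sketch that applies the same overlap test used in MySiteEqualityComparer.Equals to A, B and C. The Site class is repeated only so that the snippet compiles on its own; TransitivityDemo and its Touches helper are hypothetical names introduced for illustration.

```csharp
using System;

public class Site {
  public string RouteId;
  public decimal StartMilepost;
  public decimal EndMilepost;
}

public static class TransitivityDemo {
  // Same test as MySiteEqualityComparer.Equals, minus the null checks
  public static bool Touches(Site x, Site y) {
    if (x.RouteId != y.RouteId)
      return false;

    return (x.StartMilepost <= y.StartMilepost && x.EndMilepost >= y.StartMilepost)
        || (y.StartMilepost <= x.StartMilepost && y.EndMilepost >= x.StartMilepost);
  }

  public static void Main() {
    var a = new Site { RouteId = "X", StartMilepost = 0.00m, EndMilepost = 1.00m };
    var b = new Site { RouteId = "X", StartMilepost = 1.00m, EndMilepost = 2.00m };
    var c = new Site { RouteId = "X", StartMilepost = 2.00m, EndMilepost = 3.00m };

    Console.WriteLine(Touches(a, b)); // True
    Console.WriteLine(Touches(b, c)); // True
    Console.WriteLine(Touches(a, c)); // False: "equality" is not transitive
  }
}
```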

Code:

 List<Site> sites = new List<Site> {
    new Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
    new Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m },
    new Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m },
    new Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m },
    new Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
    new Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m },
  };

  var result = sites
    .GroupBy(item => item.RouteId)
    .Select(group => group
        // Required Here, since MySiteEqualityComparer breaks the rules
       .OrderBy(item => item.StartMilepost)  
       .GroupBy(item => item, new MySiteEqualityComparer())
       .ToArray())
    .ToArray();

  // Let's have a look
  var report = string.Join(Environment.NewLine, result
    .Select(group => string.Join(Environment.NewLine, 
                                 group.Select(g => string.Join("; ", g)))));

  Console.Write(report);

Outcome:

A 0.00..1.00; A 1.00..2.00
A 5.00..7.00
B 3.00..5.00
B 11.00..13.00; B 13.00..14.00
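If a route can contain chains of three or more adjacent sites, a comparer-free alternative is a single pass over the sorted sites that starts a new group whenever the current site does not continue the previous one. The sketch below is a different technique from the GroupBy approach above, not a restatement of it: AdjacentSites and GroupAdjacent are hypothetical names, and "adjacent" is assumed to mean the same RouteId with the previous EndMilepost equal to the next StartMilepost, as described in the question. The Site class is repeated so that the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Site {
  public string RouteId;
  public decimal StartMilepost;
  public decimal EndMilepost;

  public override string ToString() => $"{RouteId} {StartMilepost}..{EndMilepost}";
}

public static class AdjacentSites {
  // Hypothetical helper: one pass over sites sorted by route, then by start milepost
  public static List<List<Site>> GroupAdjacent(IEnumerable<Site> sites) {
    var result = new List<List<Site>>();
    Site previous = null;

    foreach (var site in sites.OrderBy(s => s.RouteId, StringComparer.Ordinal)
                              .ThenBy(s => s.StartMilepost)) {
      // Assumption: "adjacent" = same route and previous end == current start
      bool continues = previous != null
                    && previous.RouteId == site.RouteId
                    && previous.EndMilepost == site.StartMilepost;

      // Start a new group whenever the chain is broken
      if (!continues)
        result.Add(new List<Site>());

      result[result.Count - 1].Add(site);
      previous = site;
    }

    return result;
  }

  public static void Main() {
    var sites = new List<Site> {
      new Site { RouteId = "X", StartMilepost = 2.00m, EndMilepost = 3.00m },
      new Site { RouteId = "X", StartMilepost = 0.00m, EndMilepost = 1.00m },
      new Site { RouteId = "X", StartMilepost = 1.00m, EndMilepost = 2.00m },
      new Site { RouteId = "X", StartMilepost = 5.00m, EndMilepost = 7.00m },
    };

    // One line per group, items separated by "; "
    foreach (var group in GroupAdjacent(sites))
      Console.WriteLine(string.Join("; ", group));
  }
}
```

With the input in Main, the three sites covering 0 to 3 end up in one group regardless of their original order, and the 5 to 7 site forms its own group. Because each site is compared only with its immediate predecessor in sorted order, the result does not rely on an equality comparer at all.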
Dmitry Bychenko answered Oct 05 '22 23:10