Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ GroupBy continuous time

Assuming I have a simple structure that looks like this:

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        this.Start = start;
        this.End = end;
    }
}

And I create an collection like so:

var dr1 = new Range(new DateTime(2011, 11, 1, 12, 0, 0), 
    new DateTime(2011, 11, 1, 13, 0, 0));
var dr2 = new Range(new DateTime(2011, 11, 1, 13, 0, 0), 
    new DateTime(2011, 11, 1, 14, 0, 0));
var dr3 = new Range(new DateTime(2011, 11, 1, 14, 0, 0), 
    new DateTime(2011, 11, 1, 15, 0, 0));
var dr4 = new Range(new DateTime(2011, 11, 1, 16, 0, 0), 
    new DateTime(2011, 11, 1, 17, 0, 0));

var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };

What I want to do is group the ranges where they are continuous - i.e. they are continuous if the End value of the previous Range is the same as the Start of the next.

We can assume that there are no collisions/duplicates or overlaps in the Range values.

In the example posted, I would end up with two groups:

2011-11-1 12:00:00 - 2011-11-1 15:00:00

2011-11-1 16:00:00 - 2011-11-1 17:00:00

It's fairly easy to come up with an iterative solution for this. But is there some LINQ magic I can use to get this in a pretty one-liner?

like image 752
TheNextman Avatar asked Nov 11 '11 21:11

TheNextman


4 Answers

Your best bet is to use yield and an extension method:

static IEnumerable<Range> GroupContinuous(this IEnumerable<Range> ranges)
{
    // Validate parameters.

    // Can order by start date, no overlaps, no collisions
    ranges = ranges.OrderBy(r => r.Start);

    // Get the enumerator.
    using (IEnumerator<Range> enumerator = ranges.GetEnumerator();
    {
        // Move to the first item, if nothing, break.
        if (!enumerator.MoveNext()) yield break;

        // Set the previous range.
        Range previous = enumerator.Current;

        // Cycle while there are more items.
        while (enumerator.MoveNext())
        {
            // Get the current item.
            Range current = enumerator.Current;

            // If the start date is equal to the end date
            // then merge with the previous and continue.
            if (current.Start == previous.End)
            {
                // Merge.
                previous = new Range(previous.Start, current.End);

                // Continue.
                continue;
            }

            // Yield the previous item.
            yield return previous;

            // The previous item is the current item.
            previous = current;
        }

        // Yield the previous item.
        yield return previous;
    }
}

Granted, the call to OrderBy is going to cause a full iteration of the ranges sequence, but there's no avoiding that. Once you have it ordered, you can prevent having to materialize your results before returning them; you simply yield the results if the conditions dictate.

If you know that the sequence is ordered, however, then you don't have to call OrderBy at all, and you can yield items as you traverse the list and break on different collapsed Range instances.

Ultimately, if the sequence is unordered, then you have two options:

  • Order the list and then process (remember, OrderBy is deferred as well, but will have to use one full iteration to order the sequence), using yield to return the item when you have one to process
  • Process the entire sequence at once and return as an entire materialized sequence
like image 107
casperOne Avatar answered Nov 14 '22 01:11

casperOne


A generic version of casperOne's extension method, used as such:

var items = new[]
    {
        // Range 1
        new { A = 0, B = 1 },
        new { A = 1, B = 2 },
        new { A = 2, B = 3 },
        new { A = 3, B = 4 },
        // Range 2
        new { A = 5, B = 6 },
        new { A = 6, B = 7 },
        new { A = 7, B = 8 },
        new { A = 8, B = 9 },
    };

var ranges = items.ContinousRange(
    x => x.A,
    x => x.B,
    (lower, upper) => new { A = lower, B = upper });

foreach(var range in ranges)
{
    Console.WriteLine("{0} - {1}", range.A, range.B);
}

Implementation of extension method

    /// <summary>
    /// Calculates continues ranges based on individual elements lower and upper selections. Cannot compensate for overlapping.
    /// </summary>
    /// <typeparam name="T">The type containing a range</typeparam>
    /// <typeparam name="T1">The type of range values</typeparam>
    /// <param name="source">The ranges to be combined</param>
    /// <param name="lowerSelector">The range's lower bound</param>
    /// <param name="upperSelector">The range's upper bound</param>
    /// <param name="factory">A factory to create a new range</param>
    /// <returns>An enumeration of continuous ranges</returns>
    public static IEnumerable<T> ContinousRange<T, T1>(this IEnumerable<T> source, Func<T, T1> lowerSelector, Func<T, T1> upperSelector, Func<T1, T1, T> factory)
    {
        //Validate parameters

        // Can order by start date, no overlaps, no collisions
        source = source.OrderBy(lowerSelector);

        // Get enumerator
        using(var enumerator = source.GetEnumerator())
        {
            // Move to the first item, if nothing, break.
            if (!enumerator.MoveNext()) yield break;

            // Set the previous range.
            var previous = enumerator.Current;

            // Cycle while there are more items
            while(enumerator.MoveNext())
            {
                // Get the current item.
                var current = enumerator.Current;

                // If the start date is equal to the end date
                // then merge with the previoud and continue
                if (lowerSelector(current).Equals(upperSelector(previous)))
                {
                    // Merge
                    previous = factory(lowerSelector(previous), upperSelector(current));

                    // Continue
                    continue;
                }

                // Yield the previous item.
                yield return previous;

                // The previous item is the current item.
                previous = current;
            }

            // Yield the previous item.
            yield return previous;
        }
    }
like image 44
cwharris Avatar answered Nov 14 '22 01:11

cwharris


You could use the Aggregate() method and a lambda to group them together. This is, as you say, assuming no duplicates or overlaps:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.FirstOrDefault(r => r.End == next.Start);

    if (last != null)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   

Now, if you know the ranges are ordered, you could get away with always comparing the next against the last one we processed, like so:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.LastOrDefault();

    if (last != null && last.End == next.Start)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   
like image 3
James Michael Hare Avatar answered Nov 14 '22 01:11

James Michael Hare


Here is another LINQ solution. It finds the start of each continuous range with one query, the end of each continuous range with another, and then goes through the pairs to construct new ranges.

var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);
var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);
var result = starts.Zip(ends, (s, e) => new Range(s.Start, e.End));

It could be rewritten into a one-liner, but the separate version is clearer and easier to maintain:

var result = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End).Zip(ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start), (s, e) => new Range(s.Start, e.End));
like image 2
Matthew Strawbridge Avatar answered Nov 14 '22 01:11

Matthew Strawbridge