Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Partition/split/section IEnumerable<T> into IEnumerable<IEnumerable<T>> based on a function using LINQ?

I'd like to split a sequence in C# to a sequence of sequences using LINQ. I've done some investigation, and the closest SO article I've found that is slightly related is this.

However, this question only asks how to partition the original sequence based upon a constant value. I would like to partition my sequence based on an operation.

Specifically, I have a list of objects which contain a decimal property.

public class ExampleClass
{
    public decimal TheValue { get; set; }
}

Let's say I have a sequence of ExampleClass, and the corresponding sequence of values of TheValue is:

{0,1,2,3,1,1,4,6,7,0,1,0,2,3,5,7,6,5,4,3,2,1}

I'd like to partition the original sequence into an IEnumerable<IEnumerable<ExampleClass>> with values of TheValue resembling:

{{0,1,2,3}, {1,1,4,6,7}, {0,1}, {0,2,3,5,7}, {6,5,4,3,2,1}}

I'm just lost on how this would be implemented. SO, can you help?

I have a seriously ugly solution right now, but have a "feeling" that LINQ will increase the elegance of my code.

like image 383
a developer Avatar asked Jul 11 '11 16:07

a developer


3 Answers

Okay, I think we can do this...

public static IEnumerable<IEnumerable<TElement>>
    PartitionMontonically<TElement, TKey>
    (this IEnumerable<TElement> source,
     Func<TElement, TKey> selector)
{
    // TODO: Argument validation and custom comparisons
    Comparer<TKey> keyComparer = Comparer<TKey>.Default;

    using (var iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            yield break;
        }
        TKey currentKey = selector(iterator.Current);
        List<TElement> currentList = new List<TElement> { iterator.Current };
        int sign = 0;
        while (iterator.MoveNext())
        {
            TElement element = iterator.Current;
            TKey key = selector(element);
            int nextSign = Math.Sign(keyComparer.Compare(currentKey, key));

            // Haven't decided a direction yet
            if (sign == 0)
            {
                sign = nextSign;
                currentList.Add(element);
            }
            // Same direction or no change
            else if (sign == nextSign || nextSign == 0)
            {
                currentList.Add(element);
            }
            else // Change in direction: yield current list and start a new one
            {
                yield return currentList;
                currentList = new List<TElement> { element };
                sign = 0;
            }
            currentKey = key;
        }
        yield return currentList;
    }
}

Completely untested, but I think it might work...

like image 93
Jon Skeet Avatar answered Oct 18 '22 21:10

Jon Skeet


alternatively with linq operators and some abuse of .net closures by reference.

public static IEnumerable<IEnumerable<T>> Monotonic<T>(this IEnumerable<T> enumerable)
{
  var comparator = Comparer<T>.Default;
  int i = 0;
  T last = default(T);
  return enumerable.GroupBy((value) => { i = comparator.Compare(value, last) > 0 ? i : i+1; last = value; return i; }).Select((group) => group.Select((_) => _));
}

Taken from some random utility code for partitioning IEnumerable's into a makeshift table for logging. If I recall properly, the odd ending Select is to prevent ambiguity when the input is an enumeration of strings.

like image 1
Krypes Avatar answered Oct 18 '22 20:10

Krypes


Here's a custom LINQ operator which splits a sequence according to just about any criteria. Its parameters are:

  1. xs: the input element sequence.
  2. func: a function which accepts the "current" input element and a state object, and returns as a tuple:
    • a bool stating whether the input sequence should be split before the "current" element; and
    • a state object which will be passed to the next invocation of func.
  3. initialState: the state object that gets passed to func on its first invocation.

Here it is, along with a helper class (required because yield return apparently cannot be nested):

public static IEnumerable<IEnumerable<T>> Split<T, TState>(
                  this IEnumerable<T> xs,
                  Func<T, TState, Tuple<bool, TState>> func, 
                  TState initialState)
{
    using (var splitter = new Splitter<T, TState>(xs, func, initialState))
    {
        while (splitter.HasNext)
        {
            yield return splitter.GetNext();
        }
    }
}
internal sealed class Splitter<T, TState> : IDisposable
{
    public Splitter(IEnumerable<T> xs, 
                    Func<T, TState, Tuple<bool, TState>> func, 
                    TState initialState)
    {
        this.xs = xs.GetEnumerator();
        this.func = func;
        this.state = initialState;
        this.hasNext = this.xs.MoveNext();
    }

    private readonly IEnumerator<T> xs;
    private readonly Func<T, TState, Tuple<bool, TState>> func;
    private bool hasNext;
    private TState state;

    public bool HasNext { get { return hasNext; } }

    public IEnumerable<T> GetNext()
    {
        while (hasNext)
        {
            Tuple<bool, TState> decision = func(xs.Current, state);
            state = decision.Item2;
            if (decision.Item1) yield break;
            yield return xs.Current;
            hasNext = xs.MoveNext();
        }
    }

    public void Dispose() { xs.Dispose(); }
}

Note: Here are some of the design decisions that went into the Split method:

  • It should make only a single pass over the sequence.
  • State is made explicit so that it's possible to keep side effects out of func.
like image 1
stakx - no longer contributing Avatar answered Oct 18 '22 22:10

stakx - no longer contributing