Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split IEnumerable in three parts: "above", "item", "below" with efficiency

IEnumerable<int> list = new[] { 1, 2, 7, 3, 11, 5 };
int item = (from x in list where (x == list.Max()) select x).First();
IEnumerable<int> above = from x in list where list.ToList().IndexOf(item) > list.ToList().IndexOf(x) select x;
IEnumerable<int> below = from x in list where list.ToList().IndexOf(item) < list.ToList().IndexOf(x) select x;

I want to find an element in an IEnumerable and split the IEnumerable into IEnumerables that no longer contains the element that i found. The code above illustrates the result that i want to achieve, however i had to convert the IEnumerable to a List.

I feel like there has to be a way to do that using LinQ and IEnumerable only. How can this be done?

like image 289
Johannes Avatar asked Sep 03 '14 17:09

Johannes


1 Answers

So we have several sub-problems here. The first problem is returning the item in a collection with the highest value of some projection of that item. Max only compares the item itself or, if given a projection, returns the result of that projection.

public static TSource MaxBy<TSource, TKey>(this IEnumerable<TSource> source
    , Func<TSource, TKey> selector
    , IComparer<TKey> comparer = null)
{
    if (comparer == null)
    {
        comparer = Comparer<TKey>.Default;
    }
    using (IEnumerator<TSource> iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            throw new ArgumentException("Source was empty");
        }

        TSource maxItem = iterator.Current;
        TKey maxValue = selector(maxItem);

        while (iterator.MoveNext())
        {
            TKey nextValue = selector(iterator.Current);
            if (comparer.Compare(nextValue, maxValue) > 0)
            {
                maxValue = nextValue;
                maxItem = iterator.Current;
            }
        }
        return maxItem;
    }
}

This allows us to much more efficiently get the index of the item with the largest value:

var splitPoint = list.Select((index, number) => new { index, number })
    .MaxBy(pair => pair.number)
    .index;

Next to split the collection you can simply use skip/take:

var firstHalf = list.Take(index);
var secondHalf = list.Skip(index + 1);

There are a number of different problems with the code that you have that are resolve here.

You compute the Max value for every single item in your query to get item, rather than computing it once and using that computed value.

You then go, for each item in the list, and copy all of the items to a new list, twice, search through that list to try to find the position of the max item, and then try to find the position of the current item. You then do that whole thing two times. That means that you copy the entire array into a list four times for each item, search for the position of the max item four times per item in the collection, and search through the list linearly to find the the index of the current item (something you could compute in approximately no time by just counting) twice for each item. This will scale...poorly, as the number of items increases.

The code here finds the index of the max item in a single pass of the collection, and then creates sequences that represent each half that each have virtually no overhead at all beyond simply iterating through those items.

like image 178
Servy Avatar answered Oct 13 '22 01:10

Servy