Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linq IEnumerable Extension Method - How to improve performance?

Tags:

linq

I wrote the following extension method that looks for a consecutive sequence of items that satisfy the predicate passed to it. The number of consecutive items in the sequence is determined by the parameter 'sequenceSize.

As an example, I might have an IEnumerable of integers and I want to find 10 consecutive values that are greater than 100. This extension method will determine if such a sequence exists.

This method works well. But, because of what it must do, it can be slow if there are a sizable number of elements in the IEnumerable because it has to start with the first element, look for consecutive values satisfying the predicate, then go to the second element and do the same etc.

I'm looking for suggestions on how to speed this up. I tried using AsParallel() but that had no impact.

public static IEnumerable<IEnumerable<T>> FindSequenceConsecutive<T>(this IEnumerable<T> sequence, 
                                                                     Predicate<T> predicate, 
                                                                     int sequenceSize)
{
    IEnumerable<T> current = sequence;

    while (current.Count() > sequenceSize)
    {
        IEnumerable<T> window = current.Take(sequenceSize);

        if (window.Where(x => predicate(x)).Count() >= sequenceSize)
            yield return window;

        current = current.Skip(1);
    }
}
like image 834
Randy Minder Avatar asked Dec 16 '22 10:12

Randy Minder


2 Answers

The most likely reason for the slowness of this method is the repeated invocation of .Count(), which will immediately enumerate the sequence to determine the number of elements.

You're likely better off explicitly testing the criteria and keeping track of counts, rather than using Where() and Count() repeatedly.

In general, this method is enumerating the sequence a lot. You might experience a good speed-up if you call .ToList() to enumerate the sequence once, and then perform your operations on the generated list. (Note that this won't work if you expect to use this method on infinite-length sequences.)

Update: You are testing for >= sequenceSize, even though window.Count() == sequenceSize. In other words, you just need All():

if (window.All(x => predicate(x)))
    yield return window;

Not sure how much that will help, but it's semantically clearer at least.

Further Edit: Consider this method:

public static IEnumerable<IEnumerable<T>> FindSequenceConsecutive<T>(this IEnumerable<T> sequence, Predicate<T> predicate, int sequenceSize)
{
    List<T> list = sequence.ToList();
    List<bool> matchList = list.Select(x => predicate(x)).ToList();

    int start = 0;
    int count = list.Count;

    while (start + sequenceSize <= count)
    {
        var range = matchList.GetRange(start, sequenceSize);
        if (range.All(x => x))
            yield return list.GetRange(start, sequenceSize);

        start++;
    }
}

It evaluates the sequence once, and then partitions a list of necessary.

like image 105
dlev Avatar answered Jun 01 '23 00:06

dlev


I'm thinking something like this might work for you, as you can walk over the list once and basically maintain a queue of consecutive items passing the predicate, clearing (all) and dequeueing (one) as necessary.

public static IEnumerable<IEnumerable<T>> FindSequenceConsecutive<T>(this IEnumerable<T> sequence, Predicate<T> predicate, int sequenceSize)
{
    var queue = new Queue<T>();

    foreach (T item in sequence)
    {
        if (predicate(item))
        {
            queue.Enqueue(item);
            if (queue.Count == sequenceSize)
            {
                yield return queue.ToList();
                queue.Dequeue();
            }
        }
        else
        {
            queue.Clear();
        }
    }
}

So writing

int[] array = { 1, 2, 3, 4, 5, 2, 8, 3, 5, 6 };
foreach (var seq in array.FindSequenceConsecutive(i => i > 2, 3))
{
    Console.WriteLine(string.Join(",", seq));
}

Yields

3,4,5
8,3,5
3,5,6
like image 42
Anthony Pegram Avatar answered Jun 01 '23 00:06

Anthony Pegram