Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ensure deferred execution will be executed only once or else

I ran into a weird issue and I'm wondering what I should do about it.

I have this class that return a IEnumerable<MyClass> and it is a deferred execution. Right now, there are two possible consumers. One of them sorts the result.

See the following example :

public class SomeClass
{
    public IEnumerable<MyClass> GetMyStuff(Param givenParam)
    {
        double culmulativeSum = 0;
        return myStuff.Where(...)
                      .OrderBy(...)
                      .TakeWhile( o => 
                      {
                          bool returnValue = culmulativeSum  < givenParam.Maximum;
                          culmulativeSum += o.SomeNumericValue;
                          return returnValue; 
                      };
    }
}

Consumers call the deferred execution only once, but if they were to call it more than that, the result would be wrong as the culmulativeSum wouldn't be reset. I've found the issue by inadvertence with unit testing.

The easiest way for me to fix the issue would be to just add .ToArray() and get rid of the deferred execution at the cost of a little bit of overhead.

I could also add unit test in consumers class to ensure they call it only once, but that wouldn't prevent any new consumer coded in the future from this potential issue.

Another thing that came to my mind was to make subsequent execution throw. Something like

return myStuff.Where(...)
       .OrderBy(...)
       .TakeWhile(...)
       .ThrowIfExecutedMoreThan(1);

Obviously this doesn't exist. Would it be a good idea to implement such thing and how would you do it?

Otherwise, if there is a big pink elephant that I don't see, pointing it out will be appreciated. (I feel there is one because this question is about a very basic scenario :| )

EDIT :

Here is a bad consumer usage example :

public class ConsumerClass
{
    public void WhatEverMethod()
    {
        SomeClass some = new SomeClass();
        var stuffs = some.GetMyStuff(param);
        var nb = stuffs.Count(); //first deferred execution
        var firstOne = stuff.First(); //second deferred execution with the culmulativeSum not reset
    }
}
like image 843
AXMIM Avatar asked Nov 10 '16 21:11

AXMIM


People also ask

What is deferred execution?

Deferred query execution In a query that returns a sequence of values, the query variable itself never holds the query results and only stores the query commands. Execution of the query is deferred until the query variable is iterated over in a foreach or For Each loop.

What are the benefits of a deferred execution in LINQ?

Benefits of Deferred Execution –It avoids unnecessary query execution and hence improves performance. Query construction and Query execution are decoupled, so we can create the LINQ query in several steps. A deferred execution query is reevaluated when you re-enumerate – hence we always get the latest data.

Which LINQ methods have deferred execution?

Deferred execution is applicable on any in-memory collection as well as remote LINQ providers like LINQ-to-SQL, LINQ-to-Entities, LINQ-to-XML, etc. In the above example, you can see the query is materialized and executed when you iterate using the foreach loop. This is called deferred execution.

What is deferred in C#?

Deferred execution is supported directly in the C# language by the yield (C# Reference) keyword (in the form of the yield-return statement) when used within an iterator block. Such an iterator must return a collection of type IEnumerator or IEnumerator<T> (or a derived type).


2 Answers

Ivan's answer is very fitting for the underlying issue in OP's example - but for the general case, I have approached this in the past using an extension method similar to the one below. This ensures that the Enumerable has a single evaluation but is also deferred:

public static IMemoizedEnumerable<T> Memoize<T>(this IEnumerable<T> source)
{
    return new MemoizedEnumerable<T>(source);
}

private class MemoizedEnumerable<T> : IMemoizedEnumerable<T>, IDisposable
{
    private readonly IEnumerator<T> _sourceEnumerator;
    private readonly List<T> _cache = new List<T>();

    public MemoizedEnumerable(IEnumerable<T> source)
    {
        _sourceEnumerator = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        return IsMaterialized ? _cache.GetEnumerator() : Enumerate();
    }

    private IEnumerator<T> Enumerate()
    {
        foreach (var value in _cache)
        {
            yield return value;
        }

        while (_sourceEnumerator.MoveNext())
        {
            _cache.Add(_sourceEnumerator.Current);
            yield return _sourceEnumerator.Current;
        }

        _sourceEnumerator.Dispose();
        IsMaterialized = true;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public List<T> Materialize()
    {
        if (IsMaterialized)
            return _cache;

        while (_sourceEnumerator.MoveNext())
        {
            _cache.Add(_sourceEnumerator.Current);
        }

        _sourceEnumerator.Dispose();
        IsMaterialized = true;

        return _cache;
    }

    public bool IsMaterialized { get; private set; }

    void IDisposable.Dispose()
    {
        if(!IsMaterialized)
            _sourceEnumerator.Dispose();
    }
}

public interface IMemoizedEnumerable<T> : IEnumerable<T>
{
    List<T> Materialize();

    bool IsMaterialized { get; }
}

Example Usage:

void Consumer()
{
    //var results = GetValuesComplex();
    //var results = GetValuesComplex().ToList();
    var results = GetValuesComplex().Memoize();

    if(results.Any(i => i == 3)) 
    {
        Console.WriteLine("\nFirst Iteration");
        //return; //Potential for early exit.
    }

    var last = results.Last(); // Causes multiple enumeration in naive case.        

    Console.WriteLine("\nSecond Iteration");
}

IEnumerable<int> GetValuesComplex()
{
    for (int i = 0; i < 5; i++)
    {
        //... complex operations ...        
        Console.Write(i + ", ");
        yield return i;
    }
}
  • Naive: ✔ Deferred, ✘ Single enumeration.
  • ToList: ✘ Deferred, ✔ Single enumeration.
  • Memoize: ✔ Deferred, ✔ Single enumeration.

.

Edited to use the proper terminology and flesh out the implementation.

like image 128
Andrew Hanlon Avatar answered Oct 19 '22 16:10

Andrew Hanlon


You can solve the incorrect result issue by simply turning your method into iterator:

double culmulativeSum = 0;
var query = myStuff.Where(...)
       .OrderBy(...)
       .TakeWhile(...);
foreach (var item in query) yield return item;

It can be encapsulated in a simple extension method:

public static class Iterators
{
    public static IEnumerable<T> Lazy<T>(Func<IEnumerable<T>> source)
    {
        foreach (var item in source())
            yield return item;
    }
}

Then all you need to do in such scenarios is to surround the original method body with Iterators.Lazy call, e.g.:

return Iterators.Lazy(() =>
{
    double culmulativeSum = 0;
    return myStuff.Where(...)
           .OrderBy(...)
           .TakeWhile(...);
});
like image 32
Ivan Stoev Avatar answered Oct 19 '22 15:10

Ivan Stoev