Separate functions into validation and implementation? Why?

Question

I was reading a C# book in which the author (some dude named Jon Skeet) implements a Where function like

public static IEnumerable<T> Where<T> ( this IEnumerable<T> source, Funct<T,bool> predicate ) 
{
    if ( source == null || predicate == null ) 
    {
        throw new ArgumentNullException();
    }
    return WhereImpl(source, predicate);
}

public static IEnumerable<T> WhereImpl<T> ( IEnumerable <T> source, Func<T,bool> predicate ) 
{
    foreach ( T item in source ) 
    {
      if ( predicate(item) )  
      {
         yield return item;
      }
    }

}

Now, I fully understand how this works and that it's equivalent to

public static IEnumerable<T> Where<T> ( this IEnumerable<T> source, Funct<T,bool> predicate ) 
{
    if ( source == null || predicate == null ) 
    {
        throw new ArgumentNullException();
    }
    foreach ( T item in source ) 
    {
      if ( predicate(item) )  
      {
         yield return item;
      }
    }
}

which brings up the question of why would one separate these into 2 functions given that there would be memory/time overhead and of course more code. I always validate parameters and if I start writing like this example then I'll be writing twice as much code. Is there some school of thought which holds that validation and implementation should be separate functions?

Jon Skeet · Accepted Answer

The reason is that the iterator block is always lazy. Unless you call GetEnumerator() and then MoveNext(), the code in the method won't get executed.

In other words, consider this call to your "equivalent" method:

var ignored = OtherEnumerable.Where<string>(null, null);

No exception is thrown, because you're not calling GetEnumerator() and then MoveNext(). Compare that with my version where the exception is thrown immediately regardless of how the return value is used... because it only calls the method with the iterator block after validating eagerly.

Note that async/await has similar issues - if you have:

public async Task FooAsync(string x)
{
    if (x == null)
    {
        throw new ArgumentNullException(nameof(x));
    }
    // Do some stuff including awaiting
}

If you call this, you'll end up getting a faulted Task - rather than a NullReferenceException being thrown. If you await the returned Task, then the exception will be thrown, but that may not be where you called the method. That's okay in most cases, but worth knowing about.

Matías Fidemraizer · Answer

It may depend of the scenario and your coding style. Jon Skeet is absolutely right about why they should be separated when you're using yield to create iterators.

BTW, I thought it could be interesting adding my two cents here: same code using Code Contracts (i.e. design by contract) behaves in a different way.

Pre-conditions aren't part of the iterator block, thus, the following code will throw a contract exception immediately if the whole pre-conditions aren't met:

public static class Test
{
    public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        Contract.Requires(source != null);
        Contract.Requires(predicate != null);

        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}

// This throws a contract exception directly, no need of 
// enumerating the returned enumerable
Test.Where<string>(null, null);

Jakub Lortz · Answer

A method using yield return looks very nice and simple, but if you examine the compiled code you will notice it gets quite complicated.

The compiler generates a new class for you with state machine logic to support the enumeration. For the 2nd Where method, it's around 160 lines of code after decompilation. The actual Where method is compiled to

[IteratorStateMachine(typeof(IterarorTest.<Where>d__0<>))]
public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    IterarorTest.<Where>d__0<T> expr_07 = new IterarorTest.<Where>d__0<T>(-2);
    expr_07.<>3__source = source;
    expr_07.<>3__predicate = predicate;
    return expr_07;
}

As you can see, no arguments are checked in this method. It just returns a new iterator.

The arguments are checked in the auto-generated class' MoveNext method (the code is a bit too long to post here).

On the other hand, if you move the yield return to another method, the arguments are checked immediately when you call the Where method - which is the expected behavior here.

Edit

As noticed by Matias Fidemraizer, code contracts also solve the problem - the contract checks are inserted in the Where method

public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    __ContractsRuntime.Requires(source != null, null, "source != null");
    __ContractsRuntime.Requires(predicate != null, null, "predicate != null");
    IterarorTest.<Where>d__0<T> expr_27 = new IterarorTest.<Where>d__0<T>(-2);
    expr_27.<>3__source = source;
    expr_27.<>3__predicate = predicate;
    return expr_27;
}

Separate functions into validation and implementation? Why?

Tags:

c#

design-patterns

Subpar Web Dev

3 Answers

Jon Skeet

Matías Fidemraizer

Jakub Lortz

Recent Activity

Donate For Us

Separate functions into validation and implementation? Why?

Tags:

c#

design-patterns

Subpar Web Dev

3 Answers

Jon Skeet

Matías Fidemraizer

Jakub Lortz

Related questions

Recent Activity

Donate For Us