I was reading a C# book in which the author (some dude named Jon Skeet) implements a Where
function like
public static IEnumerable<T> Where<T> ( this IEnumerable<T> source, Funct<T,bool> predicate )
{
if ( source == null || predicate == null )
{
throw new ArgumentNullException();
}
return WhereImpl(source, predicate);
}
public static IEnumerable<T> WhereImpl<T> ( IEnumerable <T> source, Func<T,bool> predicate )
{
foreach ( T item in source )
{
if ( predicate(item) )
{
yield return item;
}
}
}
Now, I fully understand how this works and that it's equivalent to
public static IEnumerable<T> Where<T> ( this IEnumerable<T> source, Funct<T,bool> predicate )
{
if ( source == null || predicate == null )
{
throw new ArgumentNullException();
}
foreach ( T item in source )
{
if ( predicate(item) )
{
yield return item;
}
}
}
which brings up the question of why would one separate these into 2 functions given that there would be memory/time overhead and of course more code. I always validate parameters and if I start writing like this example then I'll be writing twice as much code. Is there some school of thought which holds that validation and implementation should be separate functions?
The reason is that the iterator block is always lazy. Unless you call GetEnumerator()
and then MoveNext()
, the code in the method won't get executed.
In other words, consider this call to your "equivalent" method:
var ignored = OtherEnumerable.Where<string>(null, null);
No exception is thrown, because you're not calling GetEnumerator()
and then MoveNext()
. Compare that with my version where the exception is thrown immediately regardless of how the return value is used... because it only calls the method with the iterator block after validating eagerly.
Note that async/await has similar issues - if you have:
public async Task FooAsync(string x)
{
if (x == null)
{
throw new ArgumentNullException(nameof(x));
}
// Do some stuff including awaiting
}
If you call this, you'll end up getting a faulted Task
- rather than a NullReferenceException
being thrown. If you await the returned Task
, then the exception will be thrown, but that may not be where you called the method. That's okay in most cases, but worth knowing about.
It may depend of the scenario and your coding style. Jon Skeet is absolutely right about why they should be separated when you're using yield
to create iterators.
BTW, I thought it could be interesting adding my two cents here: same code using Code Contracts (i.e. design by contract) behaves in a different way.
Pre-conditions aren't part of the iterator block, thus, the following code will throw a contract exception immediately if the whole pre-conditions aren't met:
public static class Test
{
public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
Contract.Requires(source != null);
Contract.Requires(predicate != null);
foreach (T item in source)
{
if (predicate(item))
{
yield return item;
}
}
}
}
// This throws a contract exception directly, no need of
// enumerating the returned enumerable
Test.Where<string>(null, null);
A method using yield return
looks very nice and simple, but if you examine the compiled code you will notice it gets quite complicated.
The compiler generates a new class for you with state machine logic to support the enumeration. For the 2nd Where
method, it's around 160 lines of code after decompilation. The actual Where
method is compiled to
[IteratorStateMachine(typeof(IterarorTest.<Where>d__0<>))]
public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
IterarorTest.<Where>d__0<T> expr_07 = new IterarorTest.<Where>d__0<T>(-2);
expr_07.<>3__source = source;
expr_07.<>3__predicate = predicate;
return expr_07;
}
As you can see, no arguments are checked in this method. It just returns a new iterator.
The arguments are checked in the auto-generated class' MoveNext
method (the code is a bit too long to post here).
On the other hand, if you move the yield return
to another method, the arguments are checked immediately when you call the Where
method - which is the expected behavior here.
Edit
As noticed by Matias Fidemraizer, code contracts also solve the problem - the contract checks are inserted in the Where
method
public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
__ContractsRuntime.Requires(source != null, null, "source != null");
__ContractsRuntime.Requires(predicate != null, null, "predicate != null");
IterarorTest.<Where>d__0<T> expr_27 = new IterarorTest.<Where>d__0<T>(-2);
expr_27.<>3__source = source;
expr_27.<>3__predicate = predicate;
return expr_27;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With