I have the following extension method to find an element within a sequence, and then return two IEnumerable<T>
s: one containing all the elements before that element, and one containing the element and everything that follows. I would prefer if the method were lazy, but I haven't figured out a way to do that. Can anyone come up with a solution?
public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition)
{
var a = sequence.ToArray();
return new PartitionTuple<T>
{
Before = a.TakeWhile(v => !partition(v)),
After = a.SkipWhile(v => !partition(v))
};
}
Doing sequence.ToArray()
immediately defeats the laziness requirement. However, without that line, an expensive-to-iterate sequence
may be iterated over twice. And, depending on what the calling code does, many more times.
You can use the Lazy
object to ensure that the source sequence isn't converted to an array until one of the two partitions is iterated:
public static PartitionTuple<T> Partition<T>(
this IEnumerable<T> sequence, Func<T, bool> partition)
{
var lazy = new Lazy<IEnumerable<T>>(() => sequence.ToArray());
return new PartitionTuple<T>
{
Before = lazy.MapLazySequence(s => s.TakeWhile(v => !partition(v))),
After = lazy.MapLazySequence(s => s.SkipWhile(v => !partition(v)))
};
}
We'll use this method to defer evaluating the lazy until the sequence itself is iterated:
public static IEnumerable<TResult> MapLazySequence<TSource, TResult>(
this Lazy<IEnumerable<TSource>> lazy,
Func<IEnumerable<TSource>, IEnumerable<TResult>> filter)
{
foreach (var item in filter(lazy.Value))
yield return item;
}
This is an interesting problem and to get it right, you have to know what "right" is. For the semantics of the operation, I think that this definition makes sense:
I'm not entirely sure I got the handling of the matching object right, but I hope you get the idea. I'm deferring a lot of the work to the PartitionTuple<T>
class to be able to be lazy.
public class PartitionTuple<T>
{
IEnumerable<T> source;
IList<T> before, after;
Func<T, bool> partition;
public PartitionTuple(IEnumerable<T> source, Func<T, bool> partition)
{
this.source = source;
this.partition = partition;
}
private void EnsureMaterialized()
{
if(before == null)
{
before = new List<T>();
after = new List<T>();
using(var enumerator = source.GetEnumerator())
{
while(enumerator.MoveNext() && !partition(enumerator.Current))
{
before.Add(enumerator.Current);
}
while(!partition(enumerator.Current) && enumerator.MoveNext());
while(enumerator.MoveNext())
{
after.Add(enumerator.Current);
}
}
}
}
public IEnumerable<T> Before
{
get
{
EnsureMaterialized();
return before;
}
}
public IEnumerable<T> After
{
get
{
EnsureMaterialized();
return after;
}
}
}
public static class Extensions
{
public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition)
{
return new PartitionTuple<T>(sequence, partition);
}
}
Here's a generic solution that will memoize any IEnumerable<T>
to ensure it's only iterated once, without forcing the whole thing to iterate:
public class MemoizedEnumerable<T> : IEnumerable<T>, IDisposable
{
private readonly IEnumerator<T> _childEnumerator;
private readonly List<T> _itemCache = new List<T>();
public MemoizedEnumerable(IEnumerable<T> enumerableToMemoize)
{
_childEnumerator = enumerableToMemoize.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return _itemCache.Concat(EnumerateOnce()).GetEnumerator();
}
public void Dispose()
{
_childEnumerator.Dispose();
}
private IEnumerable<T> EnumerateOnce()
{
while (_childEnumerator.MoveNext())
{
_itemCache.Add(_childEnumerator.Current);
yield return _childEnumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
public static class EnumerableExtensions
{
public static IEnumerable<T> Memoize<T>(this IEnumerable<T> enumerable)
{
return new MemoizedEnumerable<T>(enumerable);
}
}
To use it for your partitioning problem, do this:
var memoized = sequence.Memoize();
return new PartitionTuple<T>
{
Before = memoized.TakeWhile(v => !partition(v)),
After = memoized.SkipWhile(v => !partition(v))
};
This will only iterate sequence
a maximum of one time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With