I am developing a C# program which has an "IEnumerable users" that stores the ids of 4 million users. I need to loop through the IEnumerable and extract a batch 1000 ids each time to perform some operations in another method.
How do I extract 1000 ids at a time from start of the IEnumerable, do some thing else, then fetch the next batch of 1000 and so on?
Is this possible?
You can use MoreLINQ's Batch operator (available from NuGet):
foreach(IEnumerable<User> batch in users.Batch(1000)) // use batch
If simple usage of library is not an option, you can reuse implementation:
public static IEnumerable<IEnumerable<T>> Batch<T>( this IEnumerable<T> source, int size) { T[] bucket = null; var count = 0; foreach (var item in source) { if (bucket == null) bucket = new T[size]; bucket[count++] = item; if (count != size) continue; yield return bucket.Select(x => x); bucket = null; count = 0; } // Return the last bucket with all remaining elements if (bucket != null && count > 0) { Array.Resize(ref bucket, count); yield return bucket.Select(x => x); } }
BTW for performance you can simply return bucket without calling Select(x => x)
. Select is optimized for arrays, but selector delegate still would be invoked on each item. So, in your case it's better to use
yield return bucket;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With