Update: Here's a similar question
Suppose I have a DataTable with a few thousand DataRows in it.
I'd like to break up the table into smaller chunks of rows for processing.
I thought C# 3's improved ability to work with data might help.
This is the skeleton I have so far:
DataTable Table = GetTonsOfData();
// Chunks should be any IEnumerable<Chunk> type
var Chunks = ChunkifyTableIntoSmallerChunksSomehow; // ** help here! **
foreach (var Chunk in Chunks)
{
    // Chunk should be any IEnumerable<DataRow> type
    ProcessChunk(Chunk);
}
Any suggestions on what should replace ChunkifyTableIntoSmallerChunksSomehow?
I'm really interested in how someone would do this with access to C# 3 tools. If attempting to apply these tools is inappropriate, please explain!
Update 3 (revised chunking as I really want tables, not IEnumerables; going with an extension method--thanks Jacob):
Final implementation:
Extension method to handle the chunking:
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class HarenExtensions
{
    public static IEnumerable<DataTable> Chunkify(this DataTable table, int chunkSize)
    {
        for (int i = 0; i < table.Rows.Count; i += chunkSize)
        {
            // Clone() copies the schema only; the rows are imported below.
            DataTable Chunk = table.Clone();
            foreach (DataRow Row in table.Select().Skip(i).Take(chunkSize))
            {
                Chunk.ImportRow(Row);
            }
            yield return Chunk;
        }
    }
}
Example consumer of that extension method, with sample output from an ad hoc test:
class Program
{
    static void Main(string[] args)
    {
        DataTable Table = GetTonsOfData();
        foreach (DataTable Chunk in Table.Chunkify(100))
        {
            Console.WriteLine("{0} - {1}", Chunk.Rows[0][0], Chunk.Rows[Chunk.Rows.Count - 1][0]);
        }
        Console.ReadLine();
    }

    static DataTable GetTonsOfData()
    {
        DataTable Table = new DataTable();
        Table.Columns.Add(new DataColumn());
        for (int i = 0; i < 1000; i++)
        {
            DataRow Row = Table.NewRow();
            Row[0] = i;
            Table.Rows.Add(Row);
        }
        return Table;
    }
}
This is quite readable and only iterates through the sequence once, perhaps saving you the rather bad performance characteristics of repeated redundant Skip()/Take() calls:
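public IEnumerable<IEnumerable<DataRow>> Chunkify(DataTable table, int size)
{
    List<DataRow> chunk = new List<DataRow>(size);
    // DataRowCollection is a non-generic IEnumerable, so the loop variable
    // must be typed explicitly; with var, row would be of type object.
    foreach (DataRow row in table.Rows)
    {
        chunk.Add(row);
        if (chunk.Count == size)
        {
            yield return chunk;
            chunk = new List<DataRow>(size);
        }
    }
    // Return the final, partially filled chunk, if any.
    if (chunk.Count > 0) yield return chunk;
}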
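Since the question's final revision wants DataTable chunks rather than sequences of rows, the same single-pass idea can probably be combined with Clone()/ImportRow(). The following is only a sketch under that assumption: the class name SinglePassChunkExtensions and the method name ChunkifySinglePass are made up for illustration and appear nowhere in the original post, and the argument checks are an addition of mine.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical names; not part of the original question or answer.
public static class SinglePassChunkExtensions
{
    public static IEnumerable<DataTable> ChunkifySinglePass(this DataTable table, int chunkSize)
    {
        // Validate eagerly; an iterator body would defer these checks
        // until the first MoveNext() call.
        if (table == null) throw new ArgumentNullException("table");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        return Iterate(table, chunkSize);
    }

    private static IEnumerable<DataTable> Iterate(DataTable table, int chunkSize)
    {
        // Clone() copies the schema only, so each chunk starts empty.
        DataTable chunk = table.Clone();
        foreach (DataRow row in table.Rows)
        {
            chunk.ImportRow(row);
            if (chunk.Rows.Count == chunkSize)
            {
                yield return chunk;
                chunk = table.Clone();
            }
        }
        // Emit the last, partially filled chunk, if there is one.
        if (chunk.Rows.Count > 0) yield return chunk;
    }
}
```

It would be called the same way as the extension method in the question, e.g. foreach (DataTable Chunk in Table.ChunkifySinglePass(100)). One difference to be aware of: this walks table.Rows directly rather than table.Select(), so the two versions may not treat rows pending deletion identically.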