When I have to get GBs of data, save it on a collection and process it, I have memory overflows. So instead of:
public class Program
{
public IEnumerable<SomeClass> GetObjects()
{
var list = new List<SomeClass>();
while( // get implementation
list.Add(object);
}
return list;
}
public void ProcessObjects(IEnumerable<SomeClass> objects)
{
foreach(var object in objects)
// process implementation
}
void Main()
{
var objects = GetObjects();
ProcessObjects(objects);
}
}
I need to:
public class Program
{
void ProcessObject(SomeClass object)
{
// process implementation
}
public void GetAndProcessObjects()
{
var list = new List<SomeClass>();
while( // get implementation
Process(object);
}
return list;
}
void Main()
{
var objects = GetAndProcessObjects();
}
}
There is a better way?
You ought to leverage C#'s iterator blocks and use the yield return
statement to do something like this:
public class Program
{
public IEnumerable<SomeClass> GetObjects()
{
while( // get implementation
yield return object;
}
}
public void ProcessObjects(IEnumerable<SomeClass> objects)
{
foreach(var object in objects)
// process implementation
}
void Main()
{
var objects = GetObjects();
ProcessObjects(objects);
}
}
This would allow you to stream each object and not keep the entire sequence in memory - you would only need to keep one object in memory at a time.
Don't use a List, which requires all the data to be present in memory at once. Use IEnumerable<T>
and produce the data on demand, or better, use IQueryable<T>
and have the entire execution of the query deferred until the data are required.
Alternatively, don't keep the data in memory at all, but rather save the data to a database for processing. When processing is complete, then query the database for the results.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With