I have huge files, about 100 MB each. I want to load them into memory (RAM), process them, and save the result somewhere. At the same time, I want there to be a limit on memory usage, say 100 MB, so that my app doesn't use more than that. If the limit is exceeded, the file should be processed in parts.
My understanding of this:
var line = file.ReadLine();
var allowed = true;
while (allowed && line != null)
{
    var newObject = new SomeObject(line);
    list.Add(newObject);
    // Stop reading once the memory limit is reached
    allowed = CheckUsedMemory();
    line = file.ReadLine();
}
How can I limit the use of RAM? How should the CheckUsedMemory method be implemented? Thank you.
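For the check itself, something like the sketch below is what I have in mind. I am assuming the managed heap (via GC.GetTotalMemory) is the right thing to measure and using a hypothetical 100 MB threshold; I don't know whether this is the correct approach:

static class MemoryGuard
{
    const long LimitBytes = 100L * 1024 * 1024;   // hypothetical 100 MB cap

    public static bool CheckUsedMemory()
    {
        // GC.GetTotalMemory(false) reports the bytes currently allocated on
        // the managed heap without forcing a garbage collection.
        return GC.GetTotalMemory(false) < LimitBytes;
    }
}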
UPD
Thank you everybody for the good advice.
First, thanks for being aware of your memory consumption. If only more programmers were so considerate.
Second, I wouldn't bother: perhaps the user wants your application to run as fast as possible and is willing to burn 8000 megs of memory to get results 5% faster. Let them. :)
But artificially limiting the amount of memory your application takes may drastically increase processing time if you force more disk accesses in the process. If someone is running on a memory-constrained system, they are liable to already have disk traffic for swapping; if you artificially dump memory before you're really finished with it, you're only contributing further to disk IO and getting in the way of the swapping. Let the OS handle this situation.
And lastly, the access pattern you've written here (sequential, line-at-a-time) is very common, and doubtless the .NET designers have put huge amounts of effort into getting memory usage from this pattern to the bare minimum. Adding objects to your internal trees in parts is a nice idea, but very few applications can really benefit from this. (Merge sorting is one excellent application that benefits greatly from partial processing.)
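For example, File.ReadLines already streams the file lazily, one line at a time, so a loop like this keeps only the current line (plus whatever you deliberately retain) in memory; SomeObject and Process here are stand-ins for your own types:

using System.IO;

foreach (var line in File.ReadLines("input.txt"))   // lazy, line-at-a-time enumeration
{
    var obj = new SomeObject(line);
    Process(obj);   // handle each object immediately instead of collecting them all
    // obj becomes garbage right away, so memory use stays roughly constant
}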
Depending upon what you're doing with your finished list of objects, you might not be able to improve upon working with the entire list at once. Or you might benefit greatly from breaking it apart. (If MapReduce describes your data-processing problem well, then maybe you would benefit from breaking things apart.)
In any event, I'd be a little leery of using "memory" as the benchmark for deciding when to break apart processing: I'd rather use "1000 lines of input" or "ten levels of nesting" or "ran machine tools for five minutes" or something that is based on the input, rather than the secondary effect of memory consumed.
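If you do decide to break the work apart, a sketch of that input-based approach might look like this (ProcessBatch is a hypothetical method that handles and then discards each chunk):

using System.Collections.Generic;
using System.IO;

const int BatchSize = 1000;                        // threshold based on input, not on memory

var batch = new List<SomeObject>(BatchSize);
foreach (var line in File.ReadLines("input.txt"))
{
    batch.Add(new SomeObject(line));
    if (batch.Count >= BatchSize)
    {
        ProcessBatch(batch);                       // hypothetical: write results out
        batch.Clear();                             // let the processed objects be collected
    }
}
if (batch.Count > 0)
    ProcessBatch(batch);                           // handle the final partial batch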