I've got a few huge XML files, 1+ GB each, and I need to do some filtering operations on them. The easiest idea I've come up with is to save them as .txt and read them with ReadAllText, then start doing operations like
var a = File.ReadAllText("file path");
a = a.Replace("<", "\r\n<");
The moment I try that, however, the program crashes with an out-of-memory error. Watching Task Manager while it runs, I can see RAM usage climb to 50%, and the moment it reaches that point the program dies.
Does anyone have any ideas on how to operate on this file while avoiding the OutOfMemoryException, or how to allow the program to use more memory?
If you can do it line by line, then instead of saying "read everything into memory" with File.ReadAllText, you can say "yield me one line at a time" with File.ReadLines. This returns an IEnumerable&lt;string&gt; that uses deferred execution, so only the current line needs to be held in memory. You can do it like this:
using (StreamWriter sw = new StreamWriter(newFilePath))
{
    foreach (var line in File.ReadLines(path))
    {
        // Only the current line is in memory at any time.
        sw.WriteLine(line.Replace("<", "\r\n<"));
    }
    // No per-line Flush needed: disposing the writer flushes its buffer.
}
If you want to learn more about deferred execution, you can check this GitHub page.
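To see what deferred execution means in practice, here is a minimal, self-contained sketch (the Numbers method below is a made-up example, not part of the code above): nothing runs when the enumerable is created, and items are produced only as the loop pulls them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DeferredDemo
{
    // A hypothetical iterator: each value is produced lazily via yield return.
    static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 1000; i++)
        {
            Console.WriteLine($"producing {i}");
            yield return i;
        }
    }

    static void Main()
    {
        var seq = Numbers();               // nothing has executed yet
        Console.WriteLine("enumerable created");

        foreach (var n in seq.Take(2))     // only the first two items are ever produced
            Console.WriteLine($"consumed {n}");
    }
}
```

File.ReadLines works the same way: the file is read one line at a time as the foreach loop advances, which is why it stays within memory limits where File.ReadAllText does not.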