
What is the best way to parse large XML (size of 1GB) in C#?

I have a 1 GB XML file that I want to parse. If I use XmlTextReader or XmlDocument, parsing is very slow and sometimes it hangs...

sivaramakrishna asked Jan 22 '09 12:01


5 Answers

You'll have to implement custom logic using XmlReader. XmlReader does not load the full XML into memory before using it, which means you can read the document from a stream and process it node by node as it arrives.
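As a rough illustration of what such custom logic could look like, here is a minimal sketch that counts elements while reading from a stream. The class name StreamingXml, the method CountElements and the element name "item" are made up for this example and do not come from the question; for the real file you would presumably pass File.OpenRead(path) instead of the in-memory stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class StreamingXml
{
    // Counts elements with the given local name without building a DOM.
    // Only the current node is held by the reader at any time.
    public static long CountElements(Stream input, string elementName)
    {
        var settings = new XmlReaderSettings
        {
            IgnoreWhitespace = true, // skip insignificant whitespace nodes
            IgnoreComments = true    // skip comment nodes
        };

        long count = 0;
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            // Read() moves forward one node per call and returns false at the end.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        // Small in-memory stand-in for the large file (hypothetical data).
        string xml = "<items><item id=\"1\"/><item id=\"2\"/><other/></items>";
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            Console.WriteLine(CountElements(stream, "item"));
        }
    }
}
```

The same loop shape should work for other per-node processing: replace the counter with whatever you need to do at each matching element.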

Spence answered Nov 06 '22 14:11


XmlDocument is not feasible in this scenario as it will attempt to suck that gigabyte into main memory. I'm surprised that you're finding XmlTextReader to be too slow. Have you tried something like the following?

using (XmlTextReader rdr = new XmlTextReader("MyBigFile.txt"))
{
    // Advance through the document one node at a time.
    while (rdr.Read())
    {
        // Inspect rdr.NodeType, rdr.Name and rdr.Value here.
    }
}
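
If the file is a long sequence of similar records, one possible way to advance through it is to let the reader find each record and materialize only that record as a small XElement via XNode.ReadFrom. This is a sketch of a different technique than the snippet above, not something the answer prescribes; the names RecordStreamer, StreamElements, "order" and "total" are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class RecordStreamer
{
    // Lazily yields each element named elementName as a standalone XElement,
    // so only one record is in memory at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader on
                    // the node after it, so Read() must not be called in this branch.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    public static void Main()
    {
        // Tiny stand-in document (hypothetical data); for a real file,
        // pass new StreamReader(path) instead of the StringReader.
        string xml = "<orders><order id=\"1\"><total>10</total></order>"
                   + "<order id=\"2\"><total>5</total></order></orders>";
        foreach (XElement order in StreamElements(new StringReader(xml), "order"))
        {
            Console.WriteLine((string)order.Attribute("id") + ": " + (string)order.Element("total"));
        }
    }
}
```

This trades a little speed for convenience: each record can be queried with LINQ to XML, while memory use stays proportional to the largest single record rather than to the whole file.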

John Källén answered Nov 06 '22 15:11


XmlTextReader isn't supposed to hang, as it's stream-based and only works on chunks of the data at a time.

If it hangs, it may well be that you are doing something wrong when loading the file.

pilif answered Nov 06 '22 16:11


I'm not very familiar with this topic, but as far as I know the XmlReader classes ought to work fine for your specific problem. They are, after all, designed for forward-only, streaming reads of exactly this kind.

mafu answered Nov 06 '22 14:11


I would just like to back up everyone who promotes XmlReader with a performance comparison that I found:

http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html

Presidenten answered Nov 06 '22 16:11