
The most efficient way to parse Xml

The .NET Framework now has (at least) four different methods of reading an XML string. I've used each of XmlDocument, XmlReader, XPath and XElement, but which is the most efficient to use, both when coding and during execution? Is each designed for a different task, and what are the pros and cons?


Update: Using an XmlReader appears to be the quickest way to read XML, which sounds reasonable to me, but it has its limitations. I would like to know if there is any performance difference between XmlDocument and XLinq for accessing XML non-sequentially.


Update: I found some posts comparing the different methods of loading an XML document. XmlReader is the fastest. There is no significant difference between XmlDocument and LINQ to XML until you load a document with 10,000+ nodes, at which point LINQ to XML comes out in front.

  • http://www.nearinfinity.com/blogs/page/jferner?entry=performance_linq_to_sql_vs
  • http://www.hanselman.com/blog/AtAGlanceXmlReaderVsXPathNavigatorVsXmlDocument.aspx
Asked by bstoney, Mar 03 '09 13:03





1 Answer

The most common methods of reading XML are:

XmlDocument: It reads the whole file into a tree structure that can then be accessed using XPath or by walking through the nodes. It requires a lot of memory for very large files, since the whole XML structure must be loaded into memory. Very good and simple to use for smaller files (less than a few megabytes).
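As a rough sketch of the XmlDocument approach, assuming a small made-up document (the `library`/`book`/`title` names are only for illustration) and C# 9 top-level statements:

```csharp
using System;
using System.Xml;

// Made-up sample data; element and attribute names are illustrative only.
const string xml =
    "<library><book id=\"1\"><title>First</title></book>" +
    "<book id=\"2\"><title>Second</title></book></library>";

var doc = new XmlDocument();
doc.LoadXml(xml); // parses the whole string into an in-memory tree

// Random access with XPath: jump straight to the second book's title.
var title = doc.SelectSingleNode("/library/book[@id='2']/title");
Console.WriteLine(title?.InnerText);

// Or walk the child nodes of the root one by one.
foreach (XmlNode book in doc.DocumentElement.ChildNodes)
{
    Console.WriteLine(book.Attributes["id"].Value);
}
```

Because the whole tree stays in memory, you can query it in any order and as often as you like, which is exactly what makes it expensive for very large files.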

XmlReader: Fast, but also a real pain to use since it's sequential. If you ever need to go back, you can't, and XML structures are very prone to having their elements in an unpredictable order. Also, if you read from a never-ending stream of XML, this is probably the only way to go.
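To give an idea of what sequential reading looks like, here is a minimal sketch over made-up sample data (the element names are assumptions, and it uses C# 9 top-level statements). Note that the code has to keep track of its own position:

```csharp
using System;
using System.IO;
using System.Xml;

// Made-up sample data; element and attribute names are illustrative only.
const string xml =
    "<library><book id=\"1\"><title>First</title></book>" +
    "<book id=\"2\"><title>Second</title></book></library>";

using var reader = XmlReader.Create(new StringReader(xml));

string currentElement = null; // the reader keeps no history, so we track this ourselves

// Read() moves forward one node at a time; there is no way to step back.
while (reader.Read())
{
    switch (reader.NodeType)
    {
        case XmlNodeType.Element:
            currentElement = reader.Name;
            if (currentElement == "book")
                Console.WriteLine("book id: " + reader.GetAttribute("id"));
            break;
        case XmlNodeType.Text:
            if (currentElement == "title")
                Console.WriteLine("title: " + reader.Value);
            break;
        case XmlNodeType.EndElement:
            currentElement = null;
            break;
    }
}
```

Only the current node is held at any time, so memory use stays flat however large the input is.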

XML serializers: These basically do everything for you. You provide the root object of your model and the serializer creates and reads the XML for you. However, you have almost no control over the structure, and reading older versions of your object is a pain, so this won't work very well for persistence.
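A minimal round trip with XmlSerializer might look like the sketch below. The `Book` class is a hypothetical model invented for the example, and the code assumes C# 9 top-level statements:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

var serializer = new XmlSerializer(typeof(Book));

// Object -> XML
var output = new StringWriter();
serializer.Serialize(output, new Book { Id = 1, Title = "First" });
string xml = output.ToString();
Console.WriteLine(xml);

// XML -> object
var book = (Book)serializer.Deserialize(new StringReader(xml));
Console.WriteLine(book.Title);

// Hypothetical model type. XmlSerializer needs a public type
// with a public parameterless constructor.
public class Book
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    public string Title { get; set; }
}
```

The shape of the XML follows the shape of the class, which is why renaming or restructuring the class later makes older files awkward to read back.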

XDocument and LINQ to XML: As Daniel Straight pointed out. I don't know it well enough to comment, so I invite anyone to edit the post and add the missing info.
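For reference only, a minimal LINQ to XML query could look something like this. It is a sketch over made-up sample data, assuming C# 9 top-level statements, and is not a statement about performance:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Made-up sample data; element and attribute names are illustrative only.
const string xml =
    "<library><book id=\"1\"><title>First</title></book>" +
    "<book id=\"2\"><title>Second</title></book></library>";

var doc = XDocument.Parse(xml); // loads the whole document into memory, like XmlDocument

// Query the tree with ordinary LINQ operators instead of XPath strings.
var titles = doc.Root
    .Elements("book")
    .Where(b => (int)b.Attribute("id") > 1)
    .Select(b => (string)b.Element("title"));

foreach (var title in titles)
{
    Console.WriteLine(title);
}
```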


Now, writing is another story. It's a pain to maintain an XmlDocument, whereas XmlWriter is a breeze to use.

I'd say, from my experience, that the best combo is to write using XmlWriter and read using XmlDocument.
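A small sketch of that combination, writing with XmlWriter and reading the result back with XmlDocument (the element names are made up for the example, and the code assumes C# 9 top-level statements):

```csharp
using System;
using System.Text;
using System.Xml;

// Write: XmlWriter emits the document front to back.
var sb = new StringBuilder();
var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };

using (var writer = XmlWriter.Create(sb, settings))
{
    writer.WriteStartElement("library");
    writer.WriteStartElement("book");
    writer.WriteAttributeString("id", "1");
    writer.WriteElementString("title", "First");
    writer.WriteEndElement(); // </book>
    writer.WriteEndElement(); // </library>
} // disposing the writer flushes everything into sb

string xml = sb.ToString();
Console.WriteLine(xml);

// Read: XmlDocument loads the result into a tree we can query freely.
var doc = new XmlDocument();
doc.LoadXml(xml);
Console.WriteLine(doc.SelectSingleNode("/library/book/title").InnerText);
```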

Answered Sep 21 '22 12:09 (3 revs)
