What is more efficient for parsing Xml, XPath with XmlDocuments, XSLT or Linq?

1 Answer

The absolute fastest way to query an XML document is the hardest: write a method that uses an XmlReader to process the input stream, and have it process nodes as it reads them. This is the way to combine parsing and querying into a single operation. (Simply using XPath doesn't do this; both XmlDocument and XPathDocument parse the document in their Load methods.) This is usually only a good idea if you're processing extremely large streams of XML data.
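To make the "process nodes as you read them" idea concrete, here is a minimal sketch. It is written with Python's standard library rather than .NET, with `ElementTree.iterparse` standing in only loosely for `XmlReader`; the sample document and the names `orders`, `order` and `total` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names are invented for this sketch.
XML = b"""<orders>
  <order id="1"><total>10.50</total></order>
  <order id="2"><total>4.25</total></order>
  <order id="3"><total>20.00</total></order>
</orders>"""


def sum_totals_streaming(stream):
    """Add up every <total> while the parser is still reading the stream."""
    running_total = 0.0
    # iterparse hands over each element as soon as its end tag has been read,
    # so the "query" runs during parsing instead of after a full load.
    for _event, element in ET.iterparse(stream, events=("end",)):
        if element.tag == "total":
            running_total += float(element.text)
        elif element.tag == "order":
            # Drop the finished order's children and text so memory use
            # stays small even for a very large stream.
            element.clear()
    return running_total


result = sum_totals_streaming(io.BytesIO(XML))
print(result)
```

The trade-off is the one described above: the loop has to know in advance exactly what it is looking for, which is why this approach mostly pays off on very large inputs.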

All three methods you've described perform similarly. XSLT has a lot of room to be the slowest of the lot, because it lets you combine the inefficiencies of XPath with the inefficiencies of template matching. XPath and LINQ queries both do essentially the same thing, which is linear searching through enumerable lists of XML nodes. I would expect LINQ to be marginally faster in practice because an XPath expression is parsed and interpreted at runtime, while a LINQ query is compiled along with the rest of your code.

But in general, how you write your query is going to have a much greater impact on execution speed than what technology you use.

The way to write fast queries against XML documents is the same whether you're using XPath or LINQ: formulate the query so that as few nodes as possible get visited during its execution. It doesn't matter which technology you use: a query that examines every node in the document is going to run a lot slower than one that examines only a small subset of them. Your ability to do that is more dependent on the structure of the XML than anything else: a document with a navigable hierarchy of elements is generally going to be a lot faster to query than one whose elements are all children of the document element.
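A rough way to see the difference is to count how many elements each style of query has to look at. The sketch below is in Python and uses hand-written helpers (`search_everywhere` and `follow_path` are hypothetical names, not library functions) that mimic a descendant search such as `//title` and a stepwise path such as `books/book/title`. The sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a navigable hierarchy.
doc = ET.fromstring("""<library>
  <books>
    <book><title>A</title></book>
    <book><title>B</title></book>
  </books>
  <members>
    <member><name>X</name></member>
    <member><name>Y</name></member>
    <member><name>Z</name></member>
  </members>
</library>""")


def search_everywhere(root, tag):
    """Mimic a '//tag' query: test every element in the document."""
    visited = 0
    matches = []
    for element in root.iter():
        visited += 1
        if element.tag == tag:
            matches.append(element)
    return matches, visited


def follow_path(root, steps):
    """Mimic an 'a/b/c' query: at each step, look only at the children
    of the elements matched by the previous step."""
    visited = 0
    current = [root]
    for step in steps:
        next_level = []
        for parent in current:
            for child in parent:
                visited += 1
                if child.tag == step:
                    next_level.append(child)
        current = next_level
    return current, visited


broad_matches, broad_visited = search_everywhere(doc, "title")
narrow_matches, narrow_visited = follow_path(doc, ["books", "book", "title"])
print(broad_visited, narrow_visited)
```

Both calls find the same two `title` elements, but the stepwise walk never descends into `members`. On a flat document, where everything is a child of the document element, there is no such branch to skip, which is the structural point made above.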

Edit:

While I'm pretty sure I'm right that the absolute fastest way to query an XML document is the hardest, the real fastest (and hardest) way doesn't use an XmlReader; it uses a state machine that directly processes characters from a stream. Like parsing XML with regular expressions, this is ordinarily a terrible idea. But it does give you the option of exchanging features for speed. By deciding not to handle those pieces of XML that you don't need for your application (e.g. namespace resolution, expansion of character entities, etc.) you can build something that will seek through a stream of characters faster than an XmlReader would. I can think of applications where this isn't even a bad idea, though I can't think of many.
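For illustration only, a character-level state machine of this kind might look like the Python sketch below. The function name `count_start_tags` is hypothetical. It assumes the input contains no comments, CDATA sections, or attribute values with a literal `>` in them, and it ignores namespaces and entities entirely, which is exactly the kind of feature-for-speed trade described above.

```python
def count_start_tags(chars, name):
    """Count start tags called `name` in an iterable of characters.

    Assumptions (not a real XML parser): no comments, no CDATA, no '>'
    inside attribute values; namespaces and entities are not handled.
    """
    OUTSIDE, IN_NAME, SKIP_TAG = range(3)
    state = OUTSIDE
    tag_name = []
    count = 0
    for ch in chars:
        if state == OUTSIDE:
            if ch == "<":
                # Start collecting the tag name that follows '<'.
                state = IN_NAME
                tag_name = []
        elif state == IN_NAME:
            if ch in " \t\r\n/>":
                # The name has ended. End tags such as </item> reach this
                # point with an empty name, so they never match.
                if "".join(tag_name) == name:
                    count += 1
                state = OUTSIDE if ch == ">" else SKIP_TAG
            else:
                tag_name.append(ch)
        else:  # SKIP_TAG: ignore attributes until the tag closes
            if ch == ">":
                state = OUTSIDE
    return count


sample = '<?xml version="1.0"?><list><item id="1">a</item><item/><items/><item\n id="2"></item></list>'
print(count_start_tags(sample, "item"))
```

Anything the assumptions rule out (a `>` inside an attribute value, a commented-out element) would silently produce a wrong count, which is why this is ordinarily a terrible idea.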

157

answered Nov 08 '22 22:11

Robert Rossney


Tags:

.net

xml

linq

xslt

xpath

Andy McCluggage
