
XSLT vs. XMLDocument Data Type -> cycling through nodes with foreach -> appending to a Literal Control

I have a question regarding styling XML in C# using the .NET framework. I'm working on a web site that has 99% of its data stored as XML. Quite often, I'm stuck with either using XSLT to transform this XML, or reading the XML into an XmlDocument, querying it with XPath, and appending the output to Literal server controls throughout the page to handle the display. I have a feeling that the latter is much more expensive in terms of memory, as I'm reading the XML into a data type and looping through the nodes with foreach statements and other logic. The only reason I typically do this is that I'm much more comfortable doing it this way, and because I am by no means an XSL guru, it's the only way I can get done what I need to do. How much more expensive is it to do it that way? And what factors might influence the cost besides the size of the XML I'm parsing? Thank you in advance.

user1124164 asked Jan 17 '23 22:01



1 Answer

It has been shown before, as in Jon Bentley's classic book, that a bad algorithm implemented in Assembler can execute many thousands of times slower than a good algorithm implemented in Basic.

Therefore, it is simply incorrect to say that "technology A is faster than technology B".
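To make the point concrete, here is a minimal sketch (in Python rather than C#, purely for brevity, and with hypothetical `customer` and `order` elements that are not from the question). Both functions use the same DOM-style technology and return the same result; only the algorithm differs. No timings are claimed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
DOC = ET.fromstring(
    "<shop>"
    "<customer id='c1' name='Ann'/><customer id='c2' name='Bob'/>"
    "<order customer='c2' total='5'/><order customer='c1' total='7'/>"
    "<order customer='c2' total='1'/>"
    "</shop>"
)


def names_by_rescanning(doc):
    """For every order, scan all customers again: O(orders * customers)."""
    names = []
    for order in doc.iter("order"):
        for customer in doc.iter("customer"):
            if customer.get("id") == order.get("customer"):
                names.append(customer.get("name"))
                break
    return names


def names_by_index(doc):
    """Index customers once, then look each order up: O(orders + customers)."""
    by_id = {c.get("id"): c.get("name") for c in doc.iter("customer")}
    return [by_id[o.get("customer")] for o in doc.iter("order")]
```

The same choice exists on the XSLT side, where a repeated search can usually be replaced by an `xsl:key` lookup, so the cost depends more on how the nodes are looked up than on which technology does the looking.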

As for the memory consumed, it is largely dependent on the size of the XML document(s). Both XmlDocument and a typical XSLT 1.0 or 2.0 processor build a representation of a complete XML document in memory, therefore working with multi-gigabyte XML documents is problematic in both cases.

XSLT 3.0 (a Working Draft when this answer was written, and a W3C Recommendation since 2017) has a streaming feature that allows an XML document to be processed in streaming mode, provided the XPath expressions used in the transformation adhere to certain restrictions.
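The difference between the two memory models can be sketched outside XSLT as well. The following is only an illustration in Python's standard library, not XSLT streaming itself, and the `item` elements with a `price` attribute are hypothetical. The first function builds the whole tree before reading it, which is comparable to loading an XmlDocument; the second handles each element as it is parsed, which is comparable in spirit to a forward-only reader such as .NET's XmlReader.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: a flat list of <item> elements.
SAMPLE = (
    "<items>"
    + "".join(f'<item price="{i}"/>' for i in range(1, 1001))
    + "</items>"
)


def total_dom(xml_text: str) -> int:
    """Build the complete tree in memory first, then walk it."""
    root = ET.fromstring(xml_text)
    return sum(int(item.get("price")) for item in root.iter("item"))


def total_streaming(xml_text: str) -> int:
    """Process each <item> when its end tag has been parsed, then empty it."""
    total = 0
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            total += int(elem.get("price"))
            # Frees the element's attributes and children. The emptied
            # element itself stays attached to the root in this sketch.
            elem.clear()
    return total
```

Both functions return the same total. The streaming version can only look at what it has already seen, which is the kind of restriction the streaming rules impose on a transformation.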

I believe that it is useful to consider a set of metrics rather than a single one.

XSLT is a language designed especially for processing tree structures, and as such it offers valuable features that aren't present in other languages. Two examples are templates and pattern matching.

Using templates and pattern matching, it is possible to express a transformation declaratively. The code is much simpler, shorter, easier to understand and maintain, and more extensible. Writing XSLT code can typically be done in minutes, compared to many hours of writing spaghetti-like code in a procedural language.
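For readers who are more at home with foreach loops, the template idea can be approximated in procedural code. What follows is a rough Python analogue, not XSLT and not how any XSLT processor is implemented; the element names (`catalog`, `book`, `title`, `note`) and the helper names are hypothetical. Each rule is registered against a pattern (here just an element name), and a generic `apply_templates` picks the rule for every child, so no loop hard-codes the shape of the document.

```python
import xml.etree.ElementTree as ET
from html import escape

TEMPLATES = {}  # pattern (element name) -> rule function


def template(match):
    """Register a rule for elements whose tag equals `match`."""
    def register(func):
        TEMPLATES[match] = func
        return func
    return register


def render(node):
    """Pick the rule that matches this node, or fall back to the default."""
    return TEMPLATES.get(node.tag, default_rule)(node)


def apply_templates(node):
    """Render every child with whichever rule matches it."""
    return "".join(render(child) for child in node)


def default_rule(node):
    # Loosely mirrors XSLT's built-in rule: emit text, recurse into children.
    # Tail text is ignored to keep the sketch short.
    return escape(node.text or "") + apply_templates(node)


@template("catalog")
def render_catalog(node):
    return "<ul>" + apply_templates(node) + "</ul>"


@template("book")
def render_book(node):
    return "<li>" + apply_templates(node) + "</li>"


@template("title")
def render_title(node):
    return "<b>" + escape(node.text or "") + "</b>"


doc = ET.fromstring(
    "<catalog><book><title>A &amp; B</title><note>draft</note></book></catalog>"
)
html_out = render(doc)  # '<ul><li><b>A &amp; B</b>draft</li></ul>'
```

Supporting a new element means adding one rule; the traversal code does not change. XSLT provides this dispatch, plus far richer match patterns, as part of the language.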

Debugging a program in a functional language, and even proving its correctness, is much easier because variables are immutable.

For the same reason, an XSLT processor has far more room for very aggressive optimization.

Finally, let me rebut a statement made in the otherwise good answer of @dash:

The xslt can effectively be the output HTML.

On the other hand, with 2 especially, you are then losing out on many of the great features that asp.net can offer, from web forms through MVC. Given that you eventually want to populate an asp.net server control, the way you are doing it is fine, as otherwise you are running xslt just to get values out

Actually, there is a style of writing XSLT in which there is a strict and clean separation between content and processing (I call this the fill-in-the-blanks pattern).

See for example, my answer to this question.

Here are some advantages of developing an XSLT application with this approach:

  1. This code can populate any rendering document (path passed as an external parameter) using the data from any data document (again path passed as an external parameter). Thus it becomes possible to create different outputs/formats populated with different data.

  2. The placeholders (gen:data elements) to be replaced with "live content" can have different format and semantics -- no limits to one's imagination.

  3. Editors (non-XSLT experts) can work on one or more rendering documents independently from each other and from the XSLT developers.

  4. A higher degree of reusability, flexibility and maintainability is achieved.
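The shape of the fill-in-the-blanks pattern can be sketched as follows. This is not the XSLT code from the linked answer; it is a minimal Python illustration under stated assumptions: the namespace URI `urn:example:gen`, the `select` attribute on `gen:data`, and the sample documents are all hypothetical, and the documents are passed as strings rather than as file paths.

```python
import xml.etree.ElementTree as ET

# Assumed placeholder name: <gen:data> in a made-up namespace.
PLACEHOLDER = "{urn:example:gen}data"

# Hypothetical rendering document: markup that editors can maintain.
RENDERING = (
    '<html xmlns:gen="urn:example:gen"><body>'
    '<h1><gen:data select="title"/></h1>'
    '<p>Price: <gen:data select="price"/> USD</p>'
    "</body></html>"
)

# Hypothetical data document.
DATA = "<product><title>Widget</title><price>9.99</price></product>"


def fill_in(rendering_xml: str, data_xml: str) -> str:
    """Replace each placeholder with the text its `select` path finds."""
    page = ET.fromstring(rendering_xml)
    data = ET.fromstring(data_xml)
    for parent in list(page.iter()):
        for child in list(parent):
            if child.tag != PLACEHOLDER:
                continue
            value = data.findtext(child.get("select"), default="")
            text = value + (child.tail or "")
            index = list(parent).index(child)
            if index == 0:
                # No preceding sibling: the text belongs to the parent.
                parent.text = (parent.text or "") + text
            else:
                # Otherwise it follows the preceding sibling.
                previous = parent[index - 1]
                previous.tail = (previous.tail or "") + text
            parent.remove(child)
    return ET.tostring(page, encoding="unicode")


page_out = fill_in(RENDERING, DATA)
# '<html><body><h1>Widget</h1><p>Price: 9.99 USD</p></body></html>'
```

The processing code knows nothing about headings or paragraphs, and the rendering document contains no logic, so either side can change without touching the other.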

To summarize: transforming XML with a DOM-based procedural language, while possible, is more costly in terms of development resources and results in lower quality in terms of code complexity, understandability, maintainability and extensibility.

Dimitre Novatchev answered Feb 08 '23 13:02
