What is the difference between SAX and DOM?

People also ask

What is DOM and SAX in PHP?

DOM stands for Document Object Model. SAX stands for Simple API for XML. DOM needs a lot of memory because it holds the whole document at once, which makes it memory inefficient. SAX does not need a lot of memory.

Why is SAX parser faster than DOM?

SAX is usually faster than DOM (most noticeably when reading large XML documents) because SAX hands you the content as a sequence of events (usually through a handler), while DOM has to create a node for every part of the document and keep building the structure until the full DOM tree is in place.

What is SAX in HTML?

SAX (Simple API for XML) is an event-driven online algorithm for parsing XML documents, with an API developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to that provided by the Document Object Model (DOM).

What advantages does a SAX parser have over a DOM parser?

1) SAX is faster than DOM. 2) SAX is good for large documents because it takes comparatively less memory than DOM. 3) SAX takes less time to read a document, whereas DOM takes more. 4) With SAX we can read data but we can't modify it (a limitation rather than an advantage).

Well, you are close.

In SAX, events are triggered while the XML is being parsed. When the parser encounters a start tag (e.g. <something>), it triggers a tagStarted event (the actual name differs by API; Java's SAX calls it startElement). Similarly, when it meets the end tag (</something>), it triggers tagEnded (endElement in Java's SAX). Using a SAX parser means you have to handle these events yourself and make sense of the data delivered with each one.
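To make the event idea concrete, here is a minimal sketch using Python's standard xml.sax module. The class name EventLogger and the helper log_events are made up for this illustration; startElement, endElement and characters are the callback names Python's ContentHandler actually uses.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records every SAX callback in the order the parser fires it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Fired when the parser meets an opening tag such as <something>.
        self.events.append(("start", name))

    def endElement(self, name):
        # Fired when the parser meets the matching closing tag.
        self.events.append(("end", name))

    def characters(self, content):
        # Fired for text between tags; whitespace-only chunks are skipped.
        if content.strip():
            self.events.append(("text", content.strip()))


def log_events(xml_text):
    """Hypothetical helper: parse a string and return the recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


events = log_events("<something><child>hi</child></something>")
for event in events:
    print(event)
```

Nothing is returned by the parser itself here; everything you learn about the document comes through the handler callbacks.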

In DOM, there are no events triggered while parsing. The entire XML is parsed and a DOM tree (of the nodes in the XML) is generated and returned. Once parsed, the user can navigate the tree to access the various data previously embedded in the various nodes in the XML.
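For comparison, a rough DOM sketch using Python's standard xml.dom.minidom. The helper child_texts and the sample document are invented for the example; the point is only that you get a finished tree back and can walk it in any direction.

```python
from xml.dom import minidom


def child_texts(xml_text, tag):
    """Hypothetical helper: parse the whole document, then query the tree."""
    doc = minidom.parseString(xml_text)  # the full tree is built here
    return [
        node.firstChild.data
        for node in doc.getElementsByTagName(tag)
        if node.firstChild is not None
        and node.firstChild.nodeType == node.TEXT_NODE
    ]


sample = "<library><book>Dune</book><book>Emma</book></library>"
print(child_texts(sample, "book"))

# Because the tree stays in memory, you can also navigate backwards,
# e.g. from a <book> element up to its parent.
doc = minidom.parseString(sample)
first_book = doc.getElementsByTagName("book")[0]
print(first_book.parentNode.tagName)
```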

In general, DOM is easier to use but has an overhead of parsing the entire XML before you can start using it.

In just a few words...

SAX (Simple API for XML): Is a stream-based processor. You only have a tiny part in memory at any time and you "sniff" the XML stream by implementing callback code for events like tagStarted() etc. It uses almost no memory, but you can't do "DOM" stuff, like use xpath or traverse trees.

DOM (Document Object Model): You load the whole thing into memory - it's a massive memory hog. You can blow memory with even medium sized documents. But you can use xpath and traverse the tree etc.

Here in simpler words:

DOM

Tree model parser (Object based) (Tree of nodes).
DOM parses the whole file and loads it into memory as a tree.
Has memory constraints since the whole XML document must be held in memory before you can use it.
DOM is read and write (can insert or delete nodes).
If the XML content is small, then prefer DOM parser.
Backward and forward search is possible for searching the tags and evaluation of the information inside the tags. So this gives the ease of navigation.
Slower at run time.
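
The "read and write" point from the DOM list can be sketched like this with Python's xml.dom.minidom. The function name replace_first_item and the <list>/<item> element names are assumptions made up for the example.

```python
from xml.dom import minidom


def replace_first_item(xml_text, new_text):
    """Hypothetical helper: delete the first <item> and append a new one."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement

    # Delete a node: possible because the whole tree is in memory.
    first = root.getElementsByTagName("item")[0]
    root.removeChild(first)

    # Insert a node: create it on the document, then attach it to the tree.
    new_item = doc.createElement("item")
    new_item.appendChild(doc.createTextNode(new_text))
    root.appendChild(new_item)

    return root.toxml()


print(replace_first_item("<list><item>a</item><item>b</item></list>", "c"))
```

A SAX handler has no equivalent of removeChild or appendChild, because there is never a tree to change.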

SAX

Event based parser (Sequence of events).
SAX parses the file as it reads it, i.e. parses node by node.
No memory constraints as it does not store the XML content in the memory.
SAX is read only i.e. can’t insert or delete the node.
Use a SAX parser when the XML content is large.
SAX reads the XML file from top to bottom and backward navigation is not possible.
Faster at run time.

You are correct in your understanding of the DOM based model. The XML file will be loaded as a whole and all its contents will be built as an in-memory representation of the tree the document represents. This can be time- and memory-consuming, depending on how large the input file is. The benefit of this approach is that you can easily query any part of the document, and freely manipulate all the nodes in the tree.

The DOM approach is typically used for small XML structures (where small depends on how much horsepower and memory your platform has) that may need to be modified and queried in different ways once they have been loaded.

SAX on the other hand is designed to handle XML input of virtually any size. Instead of the XML framework doing the hard work for you in figuring out the structure of the document and preparing potentially lots of objects for all the nodes, attributes etc., SAX completely leaves that to you.

What it basically does is read the input from the top and invoke callback methods you provide when certain "events" occur. An event might be hitting an opening tag (typically delivered together with its attributes), finding text inside an element or coming across an end-tag.

SAX stubbornly reads the input and tells you what it sees in this fashion. It is up to you to maintain all state-information you require. Usually this means you will build up some sort of state-machine.

While this approach to XML processing is a lot more tedious, it can be very powerful, too. Imagine you want to just extract the titles of news articles from a blog feed. If you read this XML using DOM it would load all the article contents, all the images etc. that are contained in the XML into memory, even though you are not even interested in it.

With SAX you can just check if the element name is (e.g.) "title" whenever your "startTag" event method is called. If so, you know that you need to record whatever the next "elementText" event offers you. When you receive the "endTag" event call, you check again if this is the closing element of the "title". After that, you just ignore all further elements, until either the input ends, or another "startTag" with a name of "title" comes along. And so on...
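
Here is roughly what that looks like with Python's standard xml.sax, as a sketch rather than the one true way to do it. TitleCollector, extract_titles and the sample feed are invented for the example; the "startTag", "elementText" and "endTag" events described above map to Python's startElement, characters and endElement callbacks.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Tiny state machine: collect text only while inside a <title>."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._in_title:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


def extract_titles(xml_text):
    """Hypothetical helper: return the text of every <title> element."""
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


feed = """<feed>
  <entry><title>First post</title><body>Long article text...</body></entry>
  <entry><title>Second post</title><body>More text...</body></entry>
</feed>"""

print(extract_titles(feed))
```

The <body> content is still read by the parser, but it is never stored anywhere, which is why this approach scales to very large inputs.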

You could read through megabytes and megabytes of XML this way, just extracting the tiny amount of data you need.

The negative side of this approach is, of course, that you need to do a lot more book-keeping yourself, depending on what data you need to extract and how complicated the XML structure is. Furthermore, you naturally cannot modify the structure of the XML tree, because you never have it in hand as a whole.

So in general, SAX is suitable for combing through potentially large amounts of data you receive with a specific "query" in mind, but need not modify, while DOM is more aimed at giving you full flexibility in changing structure and contents, at the expense of higher resource demand.

