Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

tree parser vs stream parser

Looking for an explanation on tree parser vs stream parser.

From what i have been researching the JSON build-in parser in android is a tree parser and the Jackson Json parser is a stream parser. Also, android's xml pull parser is a stream parser.

My question is what is a tree parser and could you explain the difference between a stream and tree parser? From Google I/O presenter mentioned tree parser hog much more battery life and should be avoided in place of a stream parser.

UPDATE: Is tree parser equal to Dom parser? I mean are the terms the same?

like image 639
j2emanue Avatar asked Oct 03 '22 04:10

j2emanue


1 Answers

A tree parser returns a complete parse of the text. Therefore, it doesn't give an answer until the entire text has been parsed.

In contrast, a stream parser returns information as it is processing the text. It is up to you then to build a tree if you so choose. In algorithms, this difference is the difference between what is called a batch or off-line algorithm (tree parsing) versus an on-line algorithm (stream parser).

See What's the difference between an on-line and off-line algorithm?.

So why would you choose one versus the other? The Google I/O presenter mentioned battery life. But that's a result of the more general principle you need more memory for storing a tree for the entire text and more processing time to read the entire text (assuming the stream parser can quit early).

If you are looking for specific information that uses a small part of the text such as finding the first tag in a DOM or an XML document, then the stream approach is probably the way to go.

If on the other hand you need to find all tags, and tags of various sorts which you might think of as several conceptual passes over the document, or if you'll come back to that text/tree over and over again, then you might want do the parsing once and work off of the resulting tree rather than make several passes over the text.

Similarly if the kind of information you need is best answered by thinking about the problem as a tree: getting or passing information from children nodes, sibling nodes, and/or ancestor nodes, then you'll probably want to go the tree approach. But...

In theory you can always turn a streaming parser into a tree parser by doing the work to build the tree as you go along. And that's extra code you have to write.

The difference between a stream parser and a tree parser is like the difference between a Python iterator/generator versus a list (equivalently a Ruby enumeration versus an array).

like image 67
rocky Avatar answered Oct 12 '22 23:10

rocky