
Roll your own XML parser / XML parsing algorithm?

So, just as a fun project, I decided I'd write my own XML parser. No, not to parse a specific document, and no, not using an XML parser library. I mean writing code to parse out any XML document into a usable data structure. Just because I like the challenge. :-)

With that said, so far it's proved to be... interesting. It's not as easy to parse (especially when you start taking into account special characters, CDATA, empty tags, comments, etc.) as it initially looked.

Are there any well-documented XML parsing algorithms or explanations anywhere that anyone knows of? There seem to be well-documented queue, stack, B-tree, and similar implementations everywhere, but I'm not sure I've ever seen a simple, well-documented XML parser algorithm...

I repeat: I am not looking for a pre-built parser library! I am looking for information on how to create my own parser library! Do not tell me "use expat" or "use SAX" or whatever. That's not what I'm asking for.

Keith Palmer Jr. asked Dec 16 '09 20:12



1 Answer

ANTLR offers a tutorial on parsing XML. It breaks the process down into phases: lexing, parsing, tree parsing, etc. Looks pretty interesting.
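
To give a feel for what the lexing and parsing phases might look like when hand-rolled, here is a minimal Python sketch. It is not taken from the ANTLR tutorial and is nowhere near a conforming XML parser: the names `lex`, `parse`, `unescape`, `TOKEN_RE`, and the dict-based node layout are all made up for illustration. It assumes the whole document fits in a string, ignores DOCTYPE declarations and namespaces, and assumes attribute values never contain a literal `>`.

```python
import re

# The five predefined XML entities; numeric character references are handled too.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

def unescape(text):
    """Replace &name;, &#NN; and &#xHH; references with their characters."""
    def repl(m):
        name = m.group(1)
        if name.startswith("#x"):
            return chr(int(name[2:], 16))
        if name.startswith("#"):
            return chr(int(name[1:]))
        return ENTITIES[name]  # KeyError on an unknown entity (sketch only)
    return re.sub(r"&(#x[0-9A-Fa-f]+|#[0-9]+|\w+);", repl, text)

# One alternative per token kind. Order matters: comments and CDATA
# must be tried before the generic open-tag pattern.
TOKEN_RE = re.compile(
    r"<!--(?P<comment>.*?)-->"
    r"|<!\[CDATA\[(?P<cdata>.*?)\]\]>"
    r"|<\?(?P<pi>.*?)\?>"
    r"|</(?P<close>[^\s>]+)\s*>"
    r"|<(?P<open>[^\s/>!?]+)(?P<attrs>[^>]*?)(?P<selfclose>/?)>"
    r"|(?P<text>[^<]+)",
    re.DOTALL,
)
ATTR_RE = re.compile(r"""([^\s=]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def lex(source):
    """Phase 1 (lexing): turn raw text into a stream of tokens."""
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = m.end()
        if m.group("comment") is not None or m.group("pi") is not None:
            continue  # comments and processing instructions are dropped
        if m.group("cdata") is not None:
            yield ("text", m.group("cdata"))  # CDATA is literal: no unescaping
        elif m.group("close") is not None:
            yield ("close", m.group("close"))
        elif m.group("open") is not None:
            attrs = {name: unescape(dq or sq)
                     for name, dq, sq in ATTR_RE.findall(m.group("attrs"))}
            yield ("open", m.group("open"), attrs, bool(m.group("selfclose")))
        else:
            yield ("text", unescape(m.group("text")))

def parse(source):
    """Phase 2 (parsing): build a tree using a stack of open elements."""
    root = {"tag": None, "attrs": {}, "children": []}
    stack = [root]
    for tok in lex(source):
        kind = tok[0]
        if kind == "text":
            if tok[1].strip():  # skip whitespace-only text between tags
                stack[-1]["children"].append(tok[1])
        elif kind == "open":
            _, tag, attrs, selfclose = tok
            node = {"tag": tag, "attrs": attrs, "children": []}
            stack[-1]["children"].append(node)
            if not selfclose:  # empty tags like <b/> never go on the stack
                stack.append(node)
        else:  # closing tag must match the innermost open element
            if len(stack) == 1 or stack[-1]["tag"] != tok[1]:
                raise ValueError(f"mismatched closing tag </{tok[1]}>")
            stack.pop()
    if len(stack) != 1:
        raise ValueError(f"unclosed tag <{stack[-1]['tag']}>")
    return root["children"]
```

Calling `parse('<a x="1"><b/>hi</a>')` should give back a list with one node dict for `a` whose children are the `b` node and the string `"hi"`. The stack is the part doing the real work: every open tag pushes, every close tag pops and is checked against the top, which is how nesting errors get caught. A later "tree parsing" phase would walk the returned structure.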

Corbin March answered Oct 03 '22 22:10
