
Fast, lightweight XML parser [closed]

Tags: java, json, xml, dojo

I will be getting XML documents pushed to me in one specific format. The document will always be of the same type, so the format is very strict.

I need to parse this so that I can convert it into JSON (well, a slightly bastardized version so someone else can use it with Dojo).

My question is: should I use a very fast, lightweight XML parser (no need for SAX, etc.; any suggestions?), or write my own, basically reading the document into a StringBuffer and stepping through the characters? Under the covers, I assume all XML parsers step through the string (or memory buffer) and parse it, producing output along the way.

Thanks

Edit:

The XML will be anywhere from 3 or 4 lines up to about 50 lines at the extreme.

joe90 asked Jan 25 '10 18:01



1 Answer

No, you should not try to write your own XML parser for this.

SAX itself is very lightweight and fast, so I'm not sure why you think it's too much. Also, using a string buffer would actually be much less scalable than using SAX, because SAX doesn't require you to load the whole XML file into memory. I've used SAX to parse multi-gigabyte XML files, which you wouldn't be able to do with string buffers on a 32-bit machine.
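To give an idea of how little code SAX needs, here is a minimal sketch using the JDK's built-in parser. The class name `SaxSketch`, the `parse` helper, and the sample `<order>` document are made up for illustration; the real element names would come from your own format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of any library.
class SaxSketch {

    // Returns "name=text" for every element that contains non-blank text.
    static List<String> parse(String xml) throws Exception {
        final List<String> out = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once per element, so append.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    out.add(qName + "=" + value);
                }
                text.setLength(0);
            }
        };

        // The parser streams through the input and fires the callbacks above;
        // it never builds the whole document in memory.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><id>42</id><item>widget</item></order>";
        System.out.println(parse(xml)); // prints [id=42, item=widget]
    }
}
```

From the callbacks you could write your JSON directly instead of collecting strings, which keeps memory use flat regardless of document size.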

If you have small files and you don't need to worry about performance, look into using the DOM. Java's implementation can be kind of annoying to use: you create a Document by using a DocumentBuilder, which in turn comes from a DocumentBuilderFactory.

The code to create a document from a file looks like this:

Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("file.xml"));

(Note that keeping a reference to your DocumentBuilder will speed things up if you need to parse multiple files.)
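A rough sketch of what reusing the builder might look like, assuming the XML arrives as a string. The `ReusedBuilder` class and its `parse` method are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper that creates the DocumentBuilder once and reuses it.
class ReusedBuilder {
    private final DocumentBuilder builder;

    ReusedBuilder() throws Exception {
        // Factory lookup and builder creation are the comparatively slow part,
        // so do them a single time.
        builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // DocumentBuilder is not guaranteed to be thread-safe,
    // so keep one instance per thread.
    Document parse(String xml) throws Exception {
        builder.reset(); // restore the builder to its initial configuration
        return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        ReusedBuilder parser = new ReusedBuilder();
        String[] messages = { "<a><x>1</x></a>", "<b><y>2</y></b>" };
        for (String message : messages) {
            Document d = parser.parse(message);
            System.out.println(d.getDocumentElement().getTagName());
        }
    }
}
```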

Then you use the methods in org.w3c.dom.Document to read or manipulate the contents. For example, getElementsByTagName() returns a NodeList of all the Elements with a given tag name.
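As an illustration, reading the text of every element with a given tag could look roughly like this. The `DomRead` class, the `textOf` helper, and the `<items>` sample document are assumptions made for the sketch, not anything from your format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of any library.
class DomRead {

    // Returns the text content of every element named `tag`, in document order.
    static List<String> textOf(String xml, String tag) throws Exception {
        Document d = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList nodes = d.getElementsByTagName(tag);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item>a</item><item>b</item></items>";
        System.out.println(textOf(xml, "item")); // prints [a, b]
    }
}
```

For a document of 50 lines or fewer, walking the tree like this and emitting JSON as you go is probably the simplest route.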

Chad Okere answered Sep 21 '22 21:09
