
Best XML handling class in Java [closed]

Tags:

java

xml

Which is the best class in Java to work with XML documents?

Ajay asked Aug 28 '09 09:08


3 Answers

It really depends on what you want to do with the XML document and how big the documents are.

Roughly, you can categorise XML APIs as:

  • DOM APIs - load the entire document into memory. This limits the size of document you can process, but lets the library build optimised structures for navigation and transformation.
  • Streaming APIs - your application must interpret low-level parse events (e.g. start of element, end of element), but you are not limited by memory. There are two kinds of streaming API: push and pull. Push parsers fire parse events at an object you define, and that object must keep track of the current parse state (with a state machine or a stack, for example). Pull parsers let your app pull parse events from the parser. This makes it easy to write a recursive descent parser to process the XML content, but then stack size, and so nesting depth, becomes the limit on the documents you can process.
  • XML Mappers - map XML content to Java objects. There are two main approaches to XML mapping: code-gen or reflection. Code-gen mappers generate Java classes from an XML schema, which means you don't have to duplicate the schema structure in Java code, but has the disadvantage that your Java code exactly mirrors the schema structure. Also, most code generators create NOJO classes that are awkward to work with and have no behaviour of their own. Reflective mappers let you write Java classes with rich behaviour and then define how they are mapped to/from XML. If you need to conform to a predefined schema, you'll have to make sure your classes and mapping configuration are correct w.r.t. that schema.
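
To make the push/pull distinction concrete, here is a minimal sketch using only the JDK's own SAX and StAX APIs. The sample document, the class name StreamingSketch and the method names are made up for illustration. The SAX version keeps an explicit stack of open elements, while the StAX version lets the Java call stack do that job.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingSketch {
    // Hypothetical sample document, used only for illustration.
    public static final String XML =
        "<library><book><title>Dune</title></book>"
        + "<book><title>Emma</title></book></library>";

    // Push style (SAX): the parser calls the handler, so the handler
    // tracks where it is in the document with an explicit stack.
    public static List<String> titlesWithSax(String xml) {
        List<String> titles = new ArrayList<>();
        Deque<String> path = new ArrayDeque<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                path.push(qName);
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called in several chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // pop() yields the element being closed, peek() its parent
                if ("title".equals(path.pop()) && "book".equals(path.peek())) {
                    titles.add(text.toString().trim());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return titles;
    }

    // Pull style (StAX): the application asks for the next event, so nested
    // structure can be handled by nested method calls (recursive descent).
    public static List<String> titlesWithStax(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    titles.add(readBook(reader));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return titles;
    }

    // Reads events up to the matching </book> and returns the title, if any.
    private static String readBook(XMLStreamReader reader) throws XMLStreamException {
        String title = null;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT
                    && "title".equals(reader.getLocalName())) {
                title = reader.getElementText(); // consumes through </title>
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "book".equals(reader.getLocalName())) {
                break;
            }
        }
        return title;
    }

    public static void main(String[] args) {
        System.out.println(titlesWithSax(XML));
        System.out.println(titlesWithStax(XML));
    }
}
```

Both methods return the same list of titles. The difference is where the parse state lives: in the handler's fields for SAX, and in local variables and the call stack for StAX.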

Some options available are:

  • DOM APIs: The DOM APIs in the standard library are standard (obviously!) and so interoperate with other libraries but they are awful. There are several more convenient DOM-like APIs, such as XOM (my favourite for the same reasons that Adam Batkin gives above) or JDOM. Have a look at a few and decide which API you prefer.
  • Streaming APIs: the standard library contains an implementation of the SAX push parser. The standard pull parser for Java is StAX.
  • Mapping APIs: JAXB is a JSR standard but I prefer XStream because I can more easily separate the mapping configuration from the mapped classes (no need for annotations or XML configuration) and it maps objects to/from other data formats.
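
For comparison with the more convenient DOM-like libraries, this is roughly what the standard-library DOM API looks like in use. It is only a sketch: the sample document, the class name StandardDomSketch and the method linkTexts are hypothetical. It relies solely on javax.xml.parsers, javax.xml.xpath and org.w3c.dom from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StandardDomSketch {
    // Hypothetical sample document, used only for illustration.
    public static final String XML =
        "<links>"
        + "<link href=\"http://example.com\"> Example </link>"
        + "<link href=\"http://example.org\">Other</link>"
        + "</links>";

    // Returns the trimmed text of every <link> whose href equals the given value.
    public static List<String> linkTexts(String xml, String href) {
        List<String> texts = new ArrayList<>();
        try {
            // The whole document is parsed into an in-memory tree first.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // XPath results come back as Object and must be cast.
            NodeList links = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//link", doc, XPathConstants.NODESET);

            // NodeList is not Iterable, so an index loop and a cast are needed.
            for (int i = 0; i < links.getLength(); i++) {
                Element link = (Element) links.item(i);
                if (href.equals(link.getAttribute("href"))) {
                    texts.add(link.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(linkTexts(XML, "http://example.com"));
    }
}
```

The casts, the index-based loop and the factory boilerplate are the kind of friction that libraries such as XOM, JDOM or dom4j try to remove.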

Nat answered Sep 27 '22 19:09


I find dom4j comes out on top of anything else I've used (especially JDOM, which I find to have a particularly poor API). dom4j also lets you plug in Jaxen for XPath support.

Examples:

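   import java.util.List;
   import org.dom4j.Document;
   import org.dom4j.Element;
   import org.dom4j.Node;
   import org.dom4j.io.SAXReader;

   SAXReader reader = new SAXReader(); // dom4j SAXReader
   Document document = reader.read(xmlInputStream); // dom4j Document; throws DocumentException

   // select all link nodes with href "http://example.com"
   // (dom4j 2.x returns List<Node>; 1.x returns a raw List)
   List<Node> linkNodes = document.selectNodes("//link[@href='http://example.com']");

   // select an attribute value (assumes at least one node matched)
   String val = ((Element) linkNodes.get(0)).attributeValue("href");

   // select the trimmed text of a child element of the root;
   // elementTextTrim is defined on Element, not on Document
   String value = document.getRootElement().elementTextTrim("childNode");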

Matthias answered Sep 27 '22 20:09


I think it's JDOM for ease of use.

duffymo answered Sep 27 '22 20:09