
Pretty print XML in Java 8

I have an XML file stored as a DOM Document and would like to pretty print it to the console, preferably without an external library. I know this question has been asked several times on this site, but none of the previous answers worked for me. I am using Java 8, so perhaps that is where my code differs from the earlier questions? I also tried setting the transformer implementation manually with code found on the web, but that only caused a "not found" error.

Here is my code, which currently prints each XML element on its own line, flush left, with no indentation.

    import java.io.*;
    import javax.xml.parsers.*;
    import javax.xml.transform.*;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;

    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;

    public class Test {
        public Test() {
            try {
                //java.lang.System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");

                DocumentBuilderFactory dbFactory;
                DocumentBuilder dBuilder;
                Document original = null;
                try {
                    dbFactory = DocumentBuilderFactory.newInstance();
                    dBuilder = dbFactory.newDocumentBuilder();
                    original = dBuilder.parse(new InputSource(new InputStreamReader(new FileInputStream("xml Store - Copy.xml"))));
                } catch (SAXException | IOException | ParserConfigurationException e) {
                    e.printStackTrace();
                }
                StringWriter stringWriter = new StringWriter();
                StreamResult xmlOutput = new StreamResult(stringWriter);
                TransformerFactory tf = TransformerFactory.newInstance();
                //tf.setAttribute("indent-number", 2);
                Transformer transformer = tf.newTransformer();
                transformer.setOutputProperty(OutputKeys.METHOD, "xml");
                transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
                transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
                transformer.setOutputProperty(OutputKeys.INDENT, "yes");
                transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
                transformer.transform(new DOMSource(original), xmlOutput);
                java.lang.System.out.println(xmlOutput.getWriter().toString());
            } catch (Exception ex) {
                throw new RuntimeException("Error converting to String", ex);
            }
        }

        public static void main(String[] args) {
            new Test();
        }
    }
Hungry asked Sep 16 '14 08:09



1 Answer

In reply to Espinosa's comment, here is a solution when "the original xml is not already (partially) indented or contain new lines".

Background

Excerpt from the article (see References below) inspiring this solution:

Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath’s normalize-space to locate all the whitespace nodes and remove them first.
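To make the excerpt concrete, here is a minimal sketch that counts the whitespace-only text nodes a parser keeps in the DOM, using the same XPath expression as the solution below. The class and method names (WhitespaceNodeDemo, countWhitespaceOnlyTextNodes) are made up for illustration, and the sketch assumes the default, non-validating DocumentBuilderFactory configuration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class WhitespaceNodeDemo {

    // Returns how many text nodes in the parsed document contain only whitespace.
    static int countWhitespaceOnlyTextNodes(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // normalize-space() collapses whitespace; an empty result means
        // the text node held nothing but spaces, tabs or line breaks.
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(
                "//text()[normalize-space()='']", document, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Line breaks and spaces between the tags become text nodes of <root>.
        System.out.println(countWhitespaceOnlyTextNodes(
                "<root>\n  <name>Coco Puff</name>\n</root>"));
        // No whitespace between the tags, so nothing to strip.
        System.out.println(countWhitespaceOnlyTextNodes(
                "<root><name>Coco Puff</name></root>"));
    }
}
```

These leftover nodes are what the removal loop in the solution strips before the transformer is asked to indent.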

Java Code

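    import java.io.ByteArrayInputStream;
    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    public static String toPrettyString(String xml, int indent) {
        try {
            // Turn xml string into a document
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));

            // Remove whitespaces outside tags
            document.normalize();
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
                                                          document,
                                                          XPathConstants.NODESET);

            for (int i = 0; i < nodeList.getLength(); ++i) {
                Node node = nodeList.item(i);
                node.getParentNode().removeChild(node);
            }

            // Setup pretty print options
            // ("indent-number" is specific to the transformer implementation;
            // a factory that does not recognise it throws IllegalArgumentException)
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            transformerFactory.setAttribute("indent-number", indent);
            Transformer transformer = transformerFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");

            // Return pretty print xml string
            StringWriter stringWriter = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
            return stringWriter.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }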

Sample usage

    String xml = "<root>" + //
                 "\n   "  + //
                 "\n<name>Coco Puff</name>" + //
                 "\n        <total>10</total>    </root>";

    System.out.println(toPrettyString(xml, 4));

Output

    <root>
        <name>Coco Puff</name>
        <total>10</total>
    </root>
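Since the question starts from a DOM Document rather than a string, the same idea can be sketched as a variant that accepts the Document directly. This is an illustration, not the answer's own code: the class name DocumentPrettyPrinter is hypothetical, and the sketch assumes the transformer honours the implementation-specific "{http://xml.apache.org/xslt}indent-amount" output property used in the question. A transformer that does not recognise that property should simply ignore it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class DocumentPrettyPrinter {

    // Note: this removes whitespace-only text nodes from the given document in place.
    static String toPrettyString(Document document, int indent) throws Exception {
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node node = blanks.item(i);
            node.getParentNode().removeChild(node);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: implementation-specific property; may have no effect elsewhere.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(indent));

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(
                        "<root>\n   \n<name>Coco Puff</name>\n        <total>10</total>    </root>")));
        System.out.println(toPrettyString(document, 4));
    }
}
```

The exact whitespace in the result can differ between JDK versions and transformer implementations, so treat the output shown above as indicative.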

References

  • Java: Properly Indenting XML String published on MyShittyCode
  • Save new XML node to file
Stephan answered Oct 03 '22 12:10
