Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to compare 2 XML documents in Java

People also ask

How do I compare two pom XML files?

Copy and paste, drag and drop a XML file or directly type in the editors above, and then click on "Compare" button they will be compared if the two XML are valids. You can also click on "load XML from URL" button to load your XML data from a URL (Must be https).


Sounds like a job for XMLUnit

  • http://www.xmlunit.org/
  • https://github.com/xmlunit

Example:

public class SomeTest extends XMLTestCase {
  @Test
  public void test() {
    String xml1 = ...
    String xml2 = ...

    XMLUnit.setIgnoreWhitespace(true); // ignore whitespace differences

    // can also compare xml Documents, InputSources, Readers, Diffs
    assertXMLEqual(xml1, xml2);  // assertXMLEquals comes from XMLTestCase
  }
}

The following will check if the documents are equal using standard JDK libraries.

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();

Document doc1 = db.parse(new File("file1.xml"));
doc1.normalizeDocument();

Document doc2 = db.parse(new File("file2.xml"));
doc2.normalizeDocument();

Assert.assertTrue(doc1.isEqualNode(doc2));

normalize() is there to make sure there are no cycles (there technically wouldn't be any)

The above code will require the white spaces to be the same within the elements though, because it preserves and evaluates it. The standard XML parser that comes with Java does not allow you to set a feature to provide a canonical version or understand xml:space if that is going to be a problem then you may need a replacement XML parser such as xerces or use JDOM.


Xom has a Canonicalizer utility which turns your DOMs into a regular form, which you can then stringify and compare. So regardless of whitespace irregularities or attribute ordering, you can get regular, predictable comparisons of your documents.

This works especially well in IDEs that have dedicated visual String comparators, like Eclipse. You get a visual representation of the semantic differences between the documents.


The latest version of XMLUnit can help the job of asserting two XML are equal. Also XMLUnit.setIgnoreWhitespace() and XMLUnit.setIgnoreAttributeOrder() may be necessary to the case in question.

See working code of a simple example of XML Unit use below.

import org.custommonkey.xmlunit.DetailedDiff;
import org.custommonkey.xmlunit.XMLUnit;
import org.junit.Assert;

public class TestXml {

    public static void main(String[] args) throws Exception {
        String result = "<abc             attr=\"value1\"                title=\"something\">            </abc>";
        // will be ok
        assertXMLEquals("<abc attr=\"value1\" title=\"something\"></abc>", result);
    }

    public static void assertXMLEquals(String expectedXML, String actualXML) throws Exception {
        XMLUnit.setIgnoreWhitespace(true);
        XMLUnit.setIgnoreAttributeOrder(true);

        DetailedDiff diff = new DetailedDiff(XMLUnit.compareXML(expectedXML, actualXML));

        List<?> allDifferences = diff.getAllDifferences();
        Assert.assertEquals("Differences found: "+ diff.toString(), 0, allDifferences.size());
    }

}

If using Maven, add this to your pom.xml:

<dependency>
    <groupId>xmlunit</groupId>
    <artifactId>xmlunit</artifactId>
    <version>1.4</version>
</dependency>

Building on Tom's answer, here's an example using XMLUnit v2.

It uses these maven dependencies

    <dependency>
        <groupId>org.xmlunit</groupId>
        <artifactId>xmlunit-core</artifactId>
        <version>2.0.0</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.xmlunit</groupId>
        <artifactId>xmlunit-matchers</artifactId>
        <version>2.0.0</version>
        <scope>test</scope>
    </dependency>

..and here's the test code

import static org.junit.Assert.assertThat;
import static org.xmlunit.matchers.CompareMatcher.isIdenticalTo;
import org.xmlunit.builder.Input;
import org.xmlunit.input.WhitespaceStrippedSource;

public class SomeTest extends XMLTestCase {
    @Test
    public void test() {
        String result = "<root></root>";
        String expected = "<root>  </root>";

        // ignore whitespace differences
        // https://github.com/xmlunit/user-guide/wiki/Providing-Input-to-XMLUnit#whitespacestrippedsource
        assertThat(result, isIdenticalTo(new WhitespaceStrippedSource(Input.from(expected).build())));

        assertThat(result, isIdenticalTo(Input.from(expected).build())); // will fail due to whitespace differences
    }
}

The documentation that outlines this is https://github.com/xmlunit/xmlunit#comparing-two-documents


Thanks, I extended this, try this ...

import java.io.ByteArrayInputStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class XmlDiff 
{
    private boolean nodeTypeDiff = true;
    private boolean nodeValueDiff = true;

    public boolean diff( String xml1, String xml2, List<String> diffs ) throws Exception
    {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setCoalescing(true);
        dbf.setIgnoringElementContentWhitespace(true);
        dbf.setIgnoringComments(true);
        DocumentBuilder db = dbf.newDocumentBuilder();


        Document doc1 = db.parse(new ByteArrayInputStream(xml1.getBytes()));
        Document doc2 = db.parse(new ByteArrayInputStream(xml2.getBytes()));

        doc1.normalizeDocument();
        doc2.normalizeDocument();

        return diff( doc1, doc2, diffs );

    }

    /**
     * Diff 2 nodes and put the diffs in the list 
     */
    public boolean diff( Node node1, Node node2, List<String> diffs ) throws Exception
    {
        if( diffNodeExists( node1, node2, diffs ) )
        {
            return true;
        }

        if( nodeTypeDiff )
        {
            diffNodeType(node1, node2, diffs );
        }

        if( nodeValueDiff )
        {
            diffNodeValue(node1, node2, diffs );
        }


        System.out.println(node1.getNodeName() + "/" + node2.getNodeName());

        diffAttributes( node1, node2, diffs );
        diffNodes( node1, node2, diffs );

        return diffs.size() > 0;
    }

    /**
     * Diff the nodes
     */
    public boolean diffNodes( Node node1, Node node2, List<String> diffs ) throws Exception
    {
        //Sort by Name
        Map<String,Node> children1 = new LinkedHashMap<String,Node>();      
        for( Node child1 = node1.getFirstChild(); child1 != null; child1 = child1.getNextSibling() )
        {
            children1.put( child1.getNodeName(), child1 );
        }

        //Sort by Name
        Map<String,Node> children2 = new LinkedHashMap<String,Node>();      
        for( Node child2 = node2.getFirstChild(); child2!= null; child2 = child2.getNextSibling() )
        {
            children2.put( child2.getNodeName(), child2 );
        }

        //Diff all the children1
        for( Node child1 : children1.values() )
        {
            Node child2 = children2.remove( child1.getNodeName() );
            diff( child1, child2, diffs );
        }

        //Diff all the children2 left over
        for( Node child2 : children2.values() )
        {
            Node child1 = children1.get( child2.getNodeName() );
            diff( child1, child2, diffs );
        }

        return diffs.size() > 0;
    }


    /**
     * Diff the nodes
     */
    public boolean diffAttributes( Node node1, Node node2, List<String> diffs ) throws Exception
    {        
        //Sort by Name
        NamedNodeMap nodeMap1 = node1.getAttributes();
        Map<String,Node> attributes1 = new LinkedHashMap<String,Node>();        
        for( int index = 0; nodeMap1 != null && index < nodeMap1.getLength(); index++ )
        {
            attributes1.put( nodeMap1.item(index).getNodeName(), nodeMap1.item(index) );
        }

        //Sort by Name
        NamedNodeMap nodeMap2 = node2.getAttributes();
        Map<String,Node> attributes2 = new LinkedHashMap<String,Node>();        
        for( int index = 0; nodeMap2 != null && index < nodeMap2.getLength(); index++ )
        {
            attributes2.put( nodeMap2.item(index).getNodeName(), nodeMap2.item(index) );

        }

        //Diff all the attributes1
        for( Node attribute1 : attributes1.values() )
        {
            Node attribute2 = attributes2.remove( attribute1.getNodeName() );
            diff( attribute1, attribute2, diffs );
        }

        //Diff all the attributes2 left over
        for( Node attribute2 : attributes2.values() )
        {
            Node attribute1 = attributes1.get( attribute2.getNodeName() );
            diff( attribute1, attribute2, diffs );
        }

        return diffs.size() > 0;
    }
    /**
     * Check that the nodes exist
     */
    public boolean diffNodeExists( Node node1, Node node2, List<String> diffs ) throws Exception
    {
        if( node1 == null && node2 == null )
        {
            diffs.add( getPath(node2) + ":node " + node1 + "!=" + node2 + "\n" );
            return true;
        }

        if( node1 == null && node2 != null )
        {
            diffs.add( getPath(node2) + ":node " + node1 + "!=" + node2.getNodeName() );
            return true;
        }

        if( node1 != null && node2 == null )
        {
            diffs.add( getPath(node1) + ":node " + node1.getNodeName() + "!=" + node2 );
            return true;
        }

        return false;
    }

    /**
     * Diff the Node Type
     */
    public boolean diffNodeType( Node node1, Node node2, List<String> diffs ) throws Exception
    {       
        if( node1.getNodeType() != node2.getNodeType() ) 
        {
            diffs.add( getPath(node1) + ":type " + node1.getNodeType() + "!=" + node2.getNodeType() );
            return true;
        }

        return false;
    }

    /**
     * Diff the Node Value
     */
    public boolean diffNodeValue( Node node1, Node node2, List<String> diffs ) throws Exception
    {       
        if( node1.getNodeValue() == null && node2.getNodeValue() == null )
        {
            return false;
        }

        if( node1.getNodeValue() == null && node2.getNodeValue() != null )
        {
            diffs.add( getPath(node1) + ":type " + node1 + "!=" + node2.getNodeValue() );
            return true;
        }

        if( node1.getNodeValue() != null && node2.getNodeValue() == null )
        {
            diffs.add( getPath(node1) + ":type " + node1.getNodeValue() + "!=" + node2 );
            return true;
        }

        if( !node1.getNodeValue().equals( node2.getNodeValue() ) )
        {
            diffs.add( getPath(node1) + ":type " + node1.getNodeValue() + "!=" + node2.getNodeValue() );
            return true;
        }

        return false;
    }


    /**
     * Get the node path
     */
    public String getPath( Node node )
    {
        StringBuilder path = new StringBuilder();

        do
        {           
            path.insert(0, node.getNodeName() );
            path.insert( 0, "/" );
        }
        while( ( node = node.getParentNode() ) != null );

        return path.toString();
    }
}

AssertJ 1.4+ has specific assertions to compare XML content:

String expectedXml = "<foo />";
String actualXml = "<bar />";
assertThat(actualXml).isXmlEqualTo(expectedXml);

Here is the Documentation