I'm writing unit tests for an XML builder.
I'm running into syntactic differences between the expected result and the actual result, even though the two are semantically identical.
Example:
Expected result:
<parent><child attr="test attribute">text here</child></parent>
Actual result:
<parent>
  <child attr="test attribute">
    text here
  </child>
</parent>
I tried normalizing the XML with XmlUtil.serialize(), but it keeps the whitespace, so the syntactic differences remain.
How can I get the normalized/canonical form of XML strings in order to make my tests more robust?
I'm writing a Grails application, so I'm fine with any solution in Groovy or Java.
The question and the accepted answer (as of today) correspond to a legacy version of XMLUnit.
For those interested in how to do it with XMLUnit v2 in Groovy (as a Spock test):
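import org.xmlunit.builder.DiffBuilder
import org.xmlunit.builder.Input
import org.xmlunit.diff.Diff

// Feature method; it belongs inside a class that extends spock.lang.Specification
def "XMLs must be identical"() {
    setup:
    def control = '<foo><bar></bar></foo>'
    def test = '''
        <foo>
          <bar></bar>
        </foo>
    '''
    when:
    Diff d = DiffBuilder.compare(Input.fromString(control))
        .withTest(Input.fromString(test))
        .ignoreWhitespace()      // drop whitespace-only text nodes, trim the rest
        .ignoreComments()        // leave XML comments out of the comparison
        .normalizeWhitespace()   // also collapse runs of whitespace inside text
        .build()
    then:
    !d.hasDifferences()
}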
Perhaps there is a "groovier" way of doing it but I think it's OK for illustration purposes :)
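If adding XMLUnit isn't an option, the same idea can be roughly sketched in plain Java with only the JDK's DOM parser: parse both strings, drop whitespace-only text nodes, trim the remaining text, and compare the trees. The class and method names (XmlNormalizer, parse, stripWhitespace, sameXml) are made up for this illustration, and trimming every text node is my assumption about what "semantically identical" means for the example in the question. It will be wrong for documents where leading or trailing whitespace in text matters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit or Groovy's XmlUtil.
final class XmlNormalizer {

    /** Parses an XML string and removes insignificant whitespace from the tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(true); // comments never reach the DOM
            factory.setCoalescing(true);       // CDATA sections become plain text
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml.trim())));
            doc.normalize();                   // merge adjacent text nodes
            stripWhitespace(doc.getDocumentElement());
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Removes whitespace-only text nodes and trims the remaining ones. */
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                String trimmed = child.getNodeValue().trim();
                if (trimmed.isEmpty()) {
                    node.removeChild(child);
                } else {
                    child.setNodeValue(trimmed);
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    /** True when both strings yield equal element trees after normalization. */
    static boolean sameXml(String expected, String actual) {
        Node left = parse(expected).getDocumentElement();
        Node right = parse(actual).getDocumentElement();
        return left.isEqualNode(right);
    }
}
```

In a test this would be used as assertTrue(XmlNormalizer.sameXml(expected, actual)). Unlike XMLUnit's Diff, it only tells you whether the trees match, not where they differ, so I'd still prefer XMLUnit when a readable failure message matters.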