Merging Documents while preserving xsi:type

I have two Document objects that contain similar XML. For example:

<tt:root xmlns:tt="http://myurl.com/">
  <tt:child/>
  <tt:child/>
</tt:root>

And the other one:

<ns1:root xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:child/>
  <ns1:child xsi:type="ns2:SomeType"/>
</ns1:root>

I need to merge them into one document with one root element and four child elements. The problem is that if I use document.importNode to do the merging, it handles the namespaces properly everywhere except in the value of the xsi:type attribute. So the result I get is this:

<tt:root xmlns:tt="http://myurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tt:child/>
  <tt:child/>
  <ns1:child xmlns:ns1="http://myurl.com/"/>
  <ns1:child xmlns:ns1="http://myurl.com/" xsi:type="ns2:SomeType"/>
</tt:root>

As you can see, ns2 is used in xsi:type but is not defined anywhere. Is there any automated way to solve this problem?

Thanks.

ADDED:

If this task is impossible with the standard Java DOM libraries, is there some other library I can use for it?

asked Jun 01 '11 08:06

bezmax

5 Answers

If I fix the namespace problem in your second file (by binding the "xsi" prefix) and do the merge using the code below, the namespace bindings are preserved in the output; at least they are here (vanilla 64-bit Java on Windows, build 1.6.0_24).

String s1 = "<!-- 1st XML document here -->";
String s2 = "<!-- 2nd XML document here -->";

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware( true );
DocumentBuilder builder = factory.newDocumentBuilder();

Document doc1 = builder.parse( new ByteArrayInputStream( s1.getBytes() ) );
Document doc2 = builder.parse( new ByteArrayInputStream( s2.getBytes() ) );

Element doc1root = ( Element )doc1.getDocumentElement();
Element doc2root = ( Element )doc2.getDocumentElement();

NamedNodeMap atts1 = doc1root.getAttributes();
NamedNodeMap atts2 = doc2root.getAttributes();

for( int i = 0; i < atts1.getLength(); i++ )
{
    String name = atts1.item( i ).getNodeName();
    if( name.startsWith( "xmlns:" ) )
    {
        if( atts2.getNamedItem( name ) == null )
        {
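            // Namespace declarations belong to the reserved xmlns namespace
            doc2root.setAttributeNS( "http://www.w3.org/2000/xmlns/", name, atts1.item( i ).getNodeValue() );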
        }    
    }    
}

NodeList nl = doc1.getDocumentElement().getChildNodes();
for( int i = 0; i < nl.getLength(); i++ )
{
    Node n = nl.item( i );
    doc2root.appendChild( doc2.importNode( n, true ) );

}

TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
StreamResult streamResult = new StreamResult( System.out );
transformer.transform( new DOMSource( doc2 ), streamResult );

answered Nov 15 '22 02:11

alexbrn

The problem here is the use of namespace prefixes in attribute values; something that was never considered when the namespace standard was created, and something that the common Java DOM/XML tools cannot easily handle. However, you could solve it as follows:

1. Before merging, replace every instance of xsi:type="prefix:value" with xsi:type="{namespace}value". This removes the dependency on the prefix mapping. In your example, xsi:type="ns2:SomeType" would become xsi:type="{http://myotherurl.com/}SomeType".
2. Merge the documents.
3. On the result document, reverse the replacement from step 1. The prefix mappings have to be managed carefully to avoid collisions; a new mapping may have to be created.
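
The three steps can be sketched with the standard DOM Level 3 lookup methods. This is only an illustration under assumptions: XsiTypeRewriter, expand, collapse and the generated "t0", "t1", ... prefixes are made-up names, and the sketch assumes xsi:type is the only attribute whose value contains a prefixed name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of any library.
final class XsiTypeRewriter {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses a string with a namespace-aware parser. */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Step 1: rewrite xsi:type="prefix:local" as xsi:type="{uri}local". */
    public static void expand(Element element) {
        Attr type = element.getAttributeNodeNS(XSI_NS, "type");
        if (type != null && !type.getValue().startsWith("{")) {
            String value = type.getValue().trim();
            int colon = value.indexOf(':');
            // A null prefix makes lookupNamespaceURI resolve the default namespace.
            String prefix = colon < 0 ? null : value.substring(0, colon);
            String uri = element.lookupNamespaceURI(prefix);
            if (uri != null) {
                type.setValue("{" + uri + "}" + value.substring(colon + 1));
            }
        }
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                expand((Element) c);
            }
        }
    }

    /** Step 3: rewrite xsi:type="{uri}local" back to a prefixed name. */
    public static void collapse(Element element) {
        Attr type = element.getAttributeNodeNS(XSI_NS, "type");
        int close = type == null ? -1 : type.getValue().indexOf('}');
        if (close > 1 && type.getValue().startsWith("{")) {
            String uri = type.getValue().substring(1, close);
            String local = type.getValue().substring(close + 1);
            // Reuse a prefix already bound to this URI in scope, if there is one.
            String prefix = element.lookupPrefix(uri);
            if (prefix == null) {
                // Otherwise declare a new, unused prefix on this element.
                int n = 0;
                do {
                    prefix = "t" + n++;
                } while (element.lookupNamespaceURI(prefix) != null);
                element.setAttributeNS(XMLNS_NS, "xmlns:" + prefix, uri);
            }
            type.setValue(prefix + ":" + local);
        }
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                collapse((Element) c);
            }
        }
    }

    public static void main(String[] args) {
        Document target = parse("<tt:root xmlns:tt=\"http://myurl.com/\"/>");
        Document source = parse("<ns1:root xmlns:ns1=\"http://myurl.com/\""
                + " xmlns:ns2=\"http://myotherurl.com/\""
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
                + "<ns1:child xsi:type=\"ns2:SomeType\"/></ns1:root>");
        expand(source.getDocumentElement());                              // step 1
        Node child = source.getDocumentElement().getFirstChild();
        Element imported = (Element) target.importNode(child, true);      // step 2
        target.getDocumentElement().appendChild(imported);
        collapse(target.getDocumentElement());                            // step 3
        System.out.println(imported.getAttributeNS(XSI_NS, "type"));
    }
}
```

Declaring the new prefix on the element that carries xsi:type keeps the binding local, so it cannot collide with prefixes used elsewhere in the merged document.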

answered Nov 15 '22 03:11

forty-two

A single line of XQuery could do the job: construct a new node named after the context document's root element, then copy in its children together with those from the other document:

declare variable $other external; element {node-name(*)} {*/*, $other/*/*}

Although XQuery does not give you full control over namespace nodes (at least in XQuery 1.0), it has a copy-namespaces mode setting that can be used to request that the namespace context be kept intact, in case the implementation does not preserve it by default.

If XQuery is a viable option, then saxon9he.jar could be the "magic xml library" that you are after.

Here is sample code exposing some context, using the s9api API:

import javax.xml.parsers.DocumentBuilderFactory;
import net.sf.saxon.s9api.*;
import org.w3c.dom.Document;

...

  Document merge(Document context, Document other) throws Exception
  {
    Processor processor = new Processor(false);
    XQueryExecutable executable = processor.newXQueryCompiler().compile(
      "declare variable $other external; element {node-name(*)} {*/*, $other/*/*}");
    XQueryEvaluator evaluator = executable.load();    
    DocumentBuilder db = processor.newDocumentBuilder();
    evaluator.setContextItem(db.wrap(context));
    evaluator.setExternalVariable(new QName("other"), db.wrap(other));
    Document doc =
      DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    processor.writeXdmValue(evaluator.evaluate(), new DOMDestination(doc));
    return doc;
  }

answered Nov 15 '22 04:11

Gunther

I would use JAXB and the Mergeable plugin to generate mergeFrom methods in the schema-derived classes. Then:

1. Unmarshal o1 and o2.
2. Merge o1 and o2 into o3 using the generated methods.
3. Marshal o3.

JAXB normally handles xsi:type quite well.

answered Nov 15 '22 03:11

lexicore

UPDATE

The original answer below will not work when the two documents have colliding namespace prefixes (the mapping from the second document will replace the mapping from the first).
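
One way to check whether this limitation applies to a given pair of documents is to compare the namespace declarations on the two root elements. The following is only a sketch: PrefixCollisions and its methods are made-up names, and it looks only at declarations on the root elements, not at ones nested deeper.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of any library.
final class PrefixCollisions {
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses a string with a namespace-aware parser and returns its root element. */
    public static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /**
     * Returns the names of the namespace declarations ("xmlns" or "xmlns:prefix")
     * that appear on both roots but are bound to different URIs.
     */
    public static Set<String> collidingDeclarations(Element root1, Element root2) {
        Set<String> result = new TreeSet<>();
        NamedNodeMap attrs = root2.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            if (!XMLNS_NS.equals(attr.getNamespaceURI())) {
                continue; // an ordinary attribute, not a namespace declaration
            }
            String name = attr.getNodeName();
            if (root1.hasAttribute(name)
                    && !root1.getAttribute(name).equals(attr.getValue())) {
                result.add(name);
            }
        }
        return result;
    }
}
```

If the returned set is empty, copying the second root's declarations onto the first root does not rebind any prefix the first document already uses.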

You could instead copy the namespace declarations from the second document onto the imported nodes. Since a child node can override a prefix bound by its parent, this is valid:

<foo:root xmlns:foo="urn:ROOT">
    <foo:child xmlns:foo="urn:CHILD" xsi:type="foo:child-type">
       ...
    </foo:child>
</foo:root>

In the above XML the namespace bound to the prefix "foo" is overridden in the scope of the child element. You can accomplish this for your use case by doing the following:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) throws Exception  {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        File file1 = new File("src/forum231/input1.xml");
        Document doc1 = db.parse(file1);
        Element rootElement1 = doc1.getDocumentElement();

        File file2 = new File("src/forum231/input2.xml");
        Document doc2 = db.parse(file2);
        Element rootElement2 = doc2.getDocumentElement();

        // Copy Child Nodes
        NodeList childNodes2 = rootElement2.getChildNodes();
        for(int x=0; x<childNodes2.getLength(); x++) {
            Node importedNode = doc1.importNode(childNodes2.item(x), true);
            if(importedNode.getNodeType() == Node.ELEMENT_NODE) {
                Element importedElement = (Element) importedNode;
                // Copy Attributes
                NamedNodeMap namedNodeMap2 = rootElement2.getAttributes();
                for(int y=0; y<namedNodeMap2.getLength(); y++) {
                    Attr importedAttr = (Attr) doc1.importNode(namedNodeMap2.item(y), true);
                    importedElement.setAttributeNodeNS(importedAttr);
                }
            }
            rootElement1.appendChild(importedNode);
        }

        // Output Document
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        DOMSource source = new DOMSource(doc1);
        StreamResult result = new StreamResult(System.out);
        t.transform(source, result);
    }

}

Output

<?xml version="1.0" encoding="UTF-8" standalone="no"?><tt:root xmlns:tt="http://myurl.com/">
  <tt:child/>
  <tt:child/>

  <ns1:child xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
  <ns1:child xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:SomeType"/>
</tt:root>
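
The approach above copies every attribute of the second root onto each imported child. A narrower variant, sketched here under assumptions, declares only the prefix that an xsi:type value actually uses. XsiTypePrefixFixer and its methods are hypothetical names, and the sketch assumes xsi:type is the only attribute carrying a prefixed name in its value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of any library.
final class XsiTypePrefixFixer {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses a string with a namespace-aware parser. */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /**
     * Deep-imports source into target and re-declares, on the imported copy,
     * each prefix that an xsi:type value refers to. The caller still has to
     * append the returned element somewhere in the target document.
     */
    public static Element importWithTypePrefixes(Document target, Element source) {
        Element imported = (Element) target.importNode(source, true);
        declareTypePrefixes(source, imported);
        return imported;
    }

    // Walks the source subtree and its imported copy in parallel, so that each
    // prefix is resolved where its bindings are still in scope (the source).
    private static void declareTypePrefixes(Element source, Element imported) {
        String type = source.getAttributeNS(XSI_NS, "type");
        int colon = type.indexOf(':');
        if (colon > 0) {
            String prefix = type.substring(0, colon);
            String uri = source.lookupNamespaceURI(prefix);
            if (uri != null) {
                imported.setAttributeNS(XMLNS_NS, "xmlns:" + prefix, uri);
            }
        }
        Node s = source.getFirstChild();
        Node i = imported.getFirstChild();
        while (s != null && i != null) {
            if (s instanceof Element && i instanceof Element) {
                declareTypePrefixes((Element) s, (Element) i);
            }
            s = s.getNextSibling();
            i = i.getNextSibling();
        }
    }
}
```

In the copy loop above, this would stand in for the doc1.importNode call plus the inner attribute loop, for example rootElement1.appendChild(XsiTypePrefixFixer.importWithTypePrefixes(doc1, (Element) childNodes2.item(x))) for element nodes.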

ORIGINAL ANSWER

In addition to copying the elements, you could copy the attributes. This will ensure that the resulting document contains the necessary namespace declarations:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Demo {

    public static void main(String[] args) throws Exception  {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        File file1 = new File("input1.xml");
        Document doc1 = db.parse(file1);
        Element rootElement1 = doc1.getDocumentElement();

        File file2 = new File("input2.xml");
        Document doc2 = db.parse(file2);
        Element rootElement2 = doc2.getDocumentElement();

        // Copy Attributes
        NamedNodeMap namedNodeMap2 = rootElement2.getAttributes();
        for(int x=0; x<namedNodeMap2.getLength(); x++) {
            Attr importedNode = (Attr) doc1.importNode(namedNodeMap2.item(x), true);
            rootElement1.setAttributeNodeNS(importedNode);
        }

        // Copy Child Nodes
        NodeList childNodes2 = rootElement2.getChildNodes();
        for(int x=0; x<childNodes2.getLength(); x++) {
            Node importedNode = doc1.importNode(childNodes2.item(x), true);
            rootElement1.appendChild(importedNode);
        }

        // Output Document
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        DOMSource source = new DOMSource(doc1);
        StreamResult result = new StreamResult(System.out);
        t.transform(source, result);
    }

}

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt:root xmlns:tt="http://myurl.com/" xmlns:ns1="http://myurl.com/" xmlns:ns2="http://myotherurl.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <tt:child/>
  <tt:child/>

  <ns1:child/>
  <ns1:child xsi:type="ns2:SomeType"/>
</tt:root>

answered Nov 15 '22 02:11

bdoughan
