
Why do only certain XPath expressions find nodes when the XML has a namespace prefix?

Tags: java, xpath

In the example code below, any XPath expression of the form '//elementName' returns null when the source XML has a namespace prefix (see testWithNS() in the code at the bottom).

When the source XML does not have a namespace prefix, all the listed XPath expressions return a node (see testNoNS()).

I know I could solve this by setting up a NamespaceContext (as in testWithNSContext()), parsing the XML as a namespace-aware document, and using namespace prefixes in the XPath expressions. However, I don't want to do this, because my actual code needs to process XML both with and without namespace prefixes.

My question is: why is it only

  • //test
  • //child1
  • //grandchild1
  • //child2

that return null, yet all other examples in testWithNS() return the node?

Output

testNoNS()
test = found
/test = found
//test = found
//test/* = found
//test/child1 = found
//test/child1/grandchild1 = found
//test/child2 = found
//child1 = found
//grandchild1 = found
//child1/grandchild1 = found
//child2 = found

testWithNS()
test = found
/test = found
//test = *** NOT FOUND ***
//test/* = found
//test/child1 = found
//test/child1/grandchild1 = found
//test/child2 = found
//child1 = *** NOT FOUND ***
//grandchild1 = *** NOT FOUND ***
//child1/grandchild1 = found
//child2 = *** NOT FOUND ***

testWithNSContext()
ns1:test = found
/ns1:test = found
//ns1:test = found
//ns1:test/* = found
//ns1:test/ns1:child1 = found
//ns1:test/ns1:child1/ns1:grandchild1 = found
//ns1:test/ns1:child2 = found
//ns1:child1 = found
//ns1:grandchild1 = found
//ns1:child1/ns1:grandchild1 = found
//ns1:child2 = found

Code

import java.io.StringReader;
import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathBugTest {

    private String xmlDec = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";
    private String xml = xmlDec + 
        "<test>" +
        "  <child1>" +
        "    <grandchild1/>" +
        "  </child1>" +
        "  <child2/>" +
        "</test>";
    private String xmlNs = xmlDec + 
        "<ns1:test xmlns:ns1=\"http://www.wfmc.org/2002/XPDL1.0\">" +
        "  <ns1:child1>" +
        "    <ns1:grandchild1/>" +
        "  </ns1:child1>" +
        "  <ns1:child2/>" +
        "</ns1:test>";

    final XPathFactory xpathFactory = XPathFactory.newInstance();
    final XPath xpath = xpathFactory.newXPath();

    @Test
    public void testNoNS() throws Exception {
        System.out.println("\ntestNoNS()");
        final Document doc = getDocument(xml);

        isFound("test", xpath.evaluate("test", doc, XPathConstants.NODE));
        isFound("/test", xpath.evaluate("/test", doc, XPathConstants.NODE));
        isFound("//test", xpath.evaluate("//test", doc, XPathConstants.NODE));
        isFound("//test/*", xpath.evaluate("//test/*", doc, XPathConstants.NODE));
        isFound("//test/child1", xpath.evaluate("//test/child1", doc, XPathConstants.NODE));
        isFound("//test/child1/grandchild1", xpath.evaluate("//test/child1/grandchild1", doc, XPathConstants.NODE));
        isFound("//test/child2", xpath.evaluate("//test/child2", doc, XPathConstants.NODE));
        isFound("//child1", xpath.evaluate("//child1", doc, XPathConstants.NODE));
        isFound("//grandchild1", xpath.evaluate("//grandchild1", doc, XPathConstants.NODE));
        isFound("//child1/grandchild1", xpath.evaluate("//child1/grandchild1", doc, XPathConstants.NODE));
        isFound("//child2", xpath.evaluate("//child2", doc, XPathConstants.NODE));
    }

    @Test
    public void testWithNS() throws Exception {
        System.out.println("\ntestWithNS()");
        final Document doc = getDocument(xmlNs);

        isFound("test", xpath.evaluate("test", doc, XPathConstants.NODE));
        isFound("/test", xpath.evaluate("/test", doc, XPathConstants.NODE));
        isFound("//test", xpath.evaluate("//test", doc, XPathConstants.NODE));
        isFound("//test/*", xpath.evaluate("//test/*", doc, XPathConstants.NODE));
        isFound("//test/child1", xpath.evaluate("//test/child1", doc, XPathConstants.NODE));
        isFound("//test/child1/grandchild1", xpath.evaluate("//test/child1/grandchild1", doc, XPathConstants.NODE));
        isFound("//test/child2", xpath.evaluate("//test/child2", doc, XPathConstants.NODE));
        isFound("//child1", xpath.evaluate("//child1", doc, XPathConstants.NODE));
        isFound("//grandchild1", xpath.evaluate("//grandchild1", doc, XPathConstants.NODE));
        isFound("//child1/grandchild1", xpath.evaluate("//child1/grandchild1", doc, XPathConstants.NODE));
        isFound("//child2", xpath.evaluate("//child2", doc, XPathConstants.NODE));
    }

    @Test
    public void testWithNSContext() throws Exception {
        System.out.println("\ntestWithNSContext()");
        final Document doc = getDocumentNS(xmlNs);

        xpath.setNamespaceContext(new MyNamespaceContext());

        isFound("ns1:test", xpath.evaluate("ns1:test", doc, XPathConstants.NODE));
        isFound("/ns1:test", xpath.evaluate("/ns1:test", doc, XPathConstants.NODE));
        isFound("//ns1:test", xpath.evaluate("//ns1:test", doc, XPathConstants.NODE));
        isFound("//ns1:test/*", xpath.evaluate("//ns1:test/*", doc, XPathConstants.NODE));
        isFound("//ns1:test/ns1:child1", xpath.evaluate("//ns1:test/ns1:child1", doc, XPathConstants.NODE));
        isFound("//ns1:test/ns1:child1/ns1:grandchild1", xpath.evaluate("//ns1:test/ns1:child1/ns1:grandchild1", doc, XPathConstants.NODE));
        isFound("//ns1:test/ns1:child2", xpath.evaluate("//ns1:test/ns1:child2", doc, XPathConstants.NODE));
        isFound("//ns1:child1", xpath.evaluate("//ns1:child1", doc, XPathConstants.NODE));
        isFound("//ns1:grandchild1", xpath.evaluate("//ns1:grandchild1", doc, XPathConstants.NODE));
        isFound("//ns1:child1/ns1:grandchild1", xpath.evaluate("//ns1:child1/ns1:grandchild1", doc, XPathConstants.NODE));
        isFound("//ns1:child2", xpath.evaluate("//ns1:child2", doc, XPathConstants.NODE));
    }

    private void isFound(String xpath, Object object) {
        System.out.println(xpath + " = " + (object == null ? "*** NOT FOUND ***" : "found"));
    }

    private Document getDocument(final String xml) throws Exception {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));        
    }

    private Document getDocumentNS(final String xml) throws Exception {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public class MyNamespaceContext implements NamespaceContext {
        @Override
        public String getNamespaceURI(String prefix) {
            if ("ns1".equals(prefix)) {
                return "http://www.wfmc.org/2002/XPDL1.0";
            }
            return XMLConstants.NULL_NS_URI;
        }
        @Override
        public String getPrefix(String uri) {
            throw new UnsupportedOperationException();
        }
        @Override
        public Iterator<String> getPrefixes(String uri) {
            throw new UnsupportedOperationException();
        }
    }
}

Update following Saxon test

I have now tested the same code using Saxon, changing the XPathFactory line to this:

final XPathFactory xpathFactory = new net.sf.saxon.xpath.XPathFactoryImpl();

Using Saxon, all lines in testWithNS() return *** NOT FOUND ***, rather than just the ones of the form '//elementName' as with the default Xalan implementation.

Given that I'm using a non-namespace-aware document builder factory to parse the XML, why do none of these XPath expressions work with Saxon, and only some with Xalan?

Chris R asked Jul 24 '13 13:07

1 Answer

If you want to ignore namespaces, you can use the local-name XPath function:

//*[local-name()='grandchild1']
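
As a rough illustration of how this could look in Java, here is a minimal sketch. It assumes the document is parsed with a namespace-aware DocumentBuilderFactory, so that local-name() is well defined for prefixed elements. The class name LocalNameXPathDemo and the helpers parse() and hasElement() are hypothetical names introduced only for this example; they are not part of the question's code.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LocalNameXPathDemo {

    // Hypothetical helper: parse namespace-aware (an assumption of this sketch),
    // so every element carries a proper local name whether or not it is prefixed.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: true if any element with the given local name exists,
    // regardless of its namespace. Assumes localName contains no quote characters.
    static boolean hasElement(Document doc, String localName) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//*[local-name()='" + localName + "']";
        return xpath.evaluate(expr, doc, XPathConstants.NODE) != null;
    }

    public static void main(String[] args) throws Exception {
        String plain = "<test><child1><grandchild1/></child1><child2/></test>";
        String prefixed = "<ns1:test xmlns:ns1=\"http://www.wfmc.org/2002/XPDL1.0\">"
                + "<ns1:child1><ns1:grandchild1/></ns1:child1><ns1:child2/></ns1:test>";

        // The same expression is evaluated against both documents; no NamespaceContext is set.
        for (String name : new String[] {"test", "child1", "grandchild1", "child2"}) {
            System.out.println(name + ": plain=" + hasElement(parse(plain), name)
                    + ", prefixed=" + hasElement(parse(prefixed), name));
        }
    }
}
```

The trade-off is that matching on local-name() ignores the namespace entirely, so elements with the same local name in different namespaces would all match.";
String prefixedXml = "<ns1:test xmlns:ns1=\"http://www.wfmc.org/2002/XPDL1.0\"><ns1:child1><ns1:grandchild1/></ns1:child1><ns1:child2/></ns1:test>";
for (String name : new String[] {"test", "child1", "grandchild1", "child2"}) {
    if (!LocalNameXPathDemo.hasElement(LocalNameXPathDemo.parse(plainXml), name)) throw new AssertionError("plain: " + name);
    if (!LocalNameXPathDemo.hasElement(LocalNameXPathDemo.parse(prefixedXml), name)) throw new AssertionError("prefixed: " + name);
}
if (LocalNameXPathDemo.hasElement(LocalNameXPathDemo.parse(prefixedXml), "missing")) throw new AssertionError("unexpected match");
</test>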
choroba answered Nov 03 '22 20:11