 

How to query XML using namespaces in Java with XPath?

When my XML looks like this (no xmlns), I can easily query it with an XPath like /workbook/sheets/sheet[1]:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook>
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>

But when it looks like this, I can't:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
          xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>

Any ideas?

Inez asked Jun 17 '11 18:06



1 Answer

In the second example XML file, the default namespace declaration (xmlns="...") binds the workbook, sheets, and sheet elements to a namespace. Unprefixed names in an XPath 1.0 expression always refer to elements in no namespace, so your XPath no longer matches them.

The preferred method is to register the namespace with a namespace-prefix. It makes your XPath much easier to develop, read, and maintain.
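In Java, registering the namespace could look roughly like the sketch below, which uses the standard javax.xml.xpath API with a namespace-aware DOM parser. The prefix "m", the class name NamespaceXPathDemo, and the helper evaluate are arbitrary choices for illustration, not anything from the question; the prefix in the XPath only has to map to the same namespace URI as in the document.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceXPathDemo {

    static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // The namespaced document from the question, as a Java string.
    static final String SAMPLE_XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" "
        + "xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    // Minimal NamespaceContext mapping the (arbitrary) prefix "m" to MAIN_NS.
    // A complete implementation would also handle the reserved "xml" and "xmlns" prefixes.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "m".equals(prefix) ? MAIN_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return MAIN_NS.equals(namespaceURI) ? "m" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
        }
    };

    // Parses the XML with namespace awareness and evaluates the XPath as a string.
    static String evaluate(String expression, String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default; required for namespace matching
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prefixed names resolve through CONTEXT and match the namespaced elements.
        System.out.println(evaluate("/m:workbook/m:sheets/m:sheet[1]/@name", SAMPLE_XML));
        // Unprefixed names mean "no namespace", so nothing matches and the result is empty.
        System.out.println("[" + evaluate("/workbook/sheets/sheet[1]/@name", SAMPLE_XML) + "]");
    }
}
```

Note that the name attribute stays unprefixed: unprefixed attributes are in no namespace even when the element has a default namespace.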

However, it is not mandatory that you register the namespace and use the namespace-prefix in your XPath.

You can formulate an XPath expression that uses a generic match for an element and a predicate filter that restricts the match for the desired local-name() and the namespace-uri(). For example:

/*[local-name()='workbook'
    and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
  /*[local-name()='sheets'
      and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
  /*[local-name()='sheet'
      and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]

As you can see, it produces an extremely long and verbose XPath statement that is very difficult to read (and maintain).

You could also just match on the local-name() of the element and ignore the namespace. For example:

/*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1] 
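For reference, here is a small sketch of evaluating both prefix-free forms from Java without setting a NamespaceContext. The class name LocalNameXPathDemo and the helper evaluate are made up for this example. The parser still needs to be namespace-aware; otherwise the DOM nodes carry no reliable local name or namespace URI for the predicates to test.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class LocalNameXPathDemo {

    static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // The namespaced document from the question, as a Java string.
    static final String SAMPLE_XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" "
        + "xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
        + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets>"
        + "</workbook>";

    // Matches on local name only and ignores the namespace.
    static final String SHORT_FORM =
        "/*[local-name()='workbook']/*[local-name()='sheets']"
        + "/*[local-name()='sheet'][1]/@name";

    // Matches on local name and namespace URI, with no prefix registration.
    static final String VERBOSE_FORM =
        "/*[local-name()='workbook' and namespace-uri()='" + MAIN_NS + "']"
        + "/*[local-name()='sheets' and namespace-uri()='" + MAIN_NS + "']"
        + "/*[local-name()='sheet' and namespace-uri()='" + MAIN_NS + "'][1]/@name";

    // Parses the XML with namespace awareness and evaluates the XPath as a string.
    static String evaluate(String expression, String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // No NamespaceContext is set: neither expression uses a prefix.
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(SHORT_FORM, SAMPLE_XML));   // Sheet1
        System.out.println(evaluate(VERBOSE_FORM, SAMPLE_XML)); // Sheet1
    }
}
```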

However, you run the risk of matching the wrong elements. If your XML has mixed vocabularies (which may not be an issue in this instance) that use the same local-name(), your XPath could match the wrong elements and select the wrong content.
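To make that risk concrete, the sketch below runs both styles against a made-up document in which a sheet element from a second vocabulary appears first. The namespace urn:example:other, the attribute value NotASpreadsheetSheet, and the class name MixedVocabularyDemo are all hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class MixedVocabularyDemo {

    static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical mixed-vocabulary document: x:sheet belongs to a different,
    // invented namespace but shares the local name "sheet".
    static final String MIXED_XML =
        "<workbook xmlns=\"" + MAIN_NS + "\" xmlns:x=\"urn:example:other\">"
        + "<sheets>"
        + "<x:sheet name=\"NotASpreadsheetSheet\"/>"
        + "<sheet name=\"Sheet1\"/>"
        + "</sheets>"
        + "</workbook>";

    // Ignores the namespace on the last step.
    static final String LOCAL_NAME_ONLY =
        "/*[local-name()='workbook']/*[local-name()='sheets']"
        + "/*[local-name()='sheet'][1]/@name";

    // Restricts the last step to the spreadsheet namespace before taking [1].
    static final String WITH_NAMESPACE_URI =
        "/*[local-name()='workbook']/*[local-name()='sheets']"
        + "/*[local-name()='sheet' and namespace-uri()='" + MAIN_NS + "'][1]/@name";

    // Parses the XML with namespace awareness and evaluates the XPath as a string.
    static String evaluate(String expression, String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Picks the first element named "sheet" in any namespace: the x:sheet.
        System.out.println(evaluate(LOCAL_NAME_ONLY, MIXED_XML));    // NotASpreadsheetSheet
        // Picks the first "sheet" in the spreadsheet namespace.
        System.out.println(evaluate(WITH_NAMESPACE_URI, MIXED_XML)); // Sheet1
    }
}
```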

Mads Hansen answered Sep 29 '22 05:09