When my XML looks like this (no xmlns), I can easily query it with an XPath like /workbook/sheets/sheet[1]:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook>
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>
But when it looks like this, I can't:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
          xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>
Any ideas?
XPath queries are aware of namespaces in an XML document and can use namespace prefixes to qualify element and attribute names. Qualifying element and attribute names with a namespace prefix limits the nodes returned by an XPath query to only those nodes that belong to a specific namespace.
In an XML document, namespaces give elements and attributes unique names. A namespace is identified by a URI, which a document either binds to a prefix or declares as the default namespace. The URI is only an identifier; it does not have to point to a retrievable document that defines the namespace.
In the second example XML file, the elements are bound to a namespace: the default namespace declared on workbook also applies to its unprefixed descendants. Your XPath addresses elements that are in no namespace, so nothing matches.
The preferred method is to register the namespace with a namespace-prefix. It makes your XPath much easier to develop, read, and maintain.
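How a prefix gets registered depends on the XPath engine. As a minimal sketch, assuming Python's standard-library xml.etree.ElementTree (which supports only a subset of XPath and takes the prefix mapping as a dictionary), the query could look like this. The prefix "m" is an arbitrary choice made for this example; only the URI has to match the document. Because the root element is already workbook, the path is written relative to it.

```python
import xml.etree.ElementTree as ET

XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
          xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Sheet1" sheetId="1" r:id="rId1"/>
  </sheets>
</workbook>"""

# Prefix-to-URI mapping; "m" is a name chosen for this sketch, not taken from the document.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(XML.encode("utf-8"))  # root is the <workbook> element

# Unqualified names address elements in no namespace, so this finds nothing.
unqualified = root.find("sheets/sheet[1]")

# Qualified names resolve through NS and match the namespaced elements.
qualified = root.find("m:sheets/m:sheet[1]", NS)

print(unqualified)
print(qualified.get("name"))
```

The prefix used in the query does not need to match any prefix in the document; the match is made on the namespace URI.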
However, it is not mandatory that you register the namespace and use the namespace-prefix in your XPath.
You can formulate an XPath expression that uses a generic match for an element and a predicate filter that restricts the match to the desired local-name() and namespace-uri(). For example:
/*[local-name()='workbook' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
/*[local-name()='sheets' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']
/*[local-name()='sheet' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]
As you can see, it produces an extremely long and verbose XPath statement that is very difficult to read (and maintain).
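Not every engine implements local-name() and namespace-uri(). Python's standard-library xml.etree.ElementTree, for instance, does not, but it accepts the {uri}local-name form, which likewise spells out the namespace in every step without registering a prefix. The following is a sketch of that analogous technique, not the XPath above run verbatim; the trimmed-down document and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Reduced version of the workbook document, built inline for the sketch.
XML = (
    f'<workbook xmlns="{MAIN}">'
    '<sheets><sheet name="Sheet1" sheetId="1"/></sheets>'
    '</workbook>'
)

root = ET.fromstring(XML)

# Each step carries the full namespace URI in braces, so no prefix mapping is needed.
# In the f-string, "{{" and "}}" produce the literal braces around the URI.
path = f"{{{MAIN}}}sheets/{{{MAIN}}}sheet[1]"
sheet = root.find(path)

print(root.tag)
print(sheet.get("name"))
```

As with the XPath version, the result is exact but the path is long, which is the trade-off against registering a prefix.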
You could also just match on the local-name() of the element and ignore the namespace. For example:
/*[local-name()='workbook']/*[local-name()='sheets']/*[local-name()='sheet'][1]
However, you run the risk of matching the wrong elements. If your XML has mixed vocabularies (which may not be an issue for this instance) that use the same local-name(), your XPath could match on the wrong elements and select the wrong content.
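The risk can be illustrated with a small sketch. It assumes Python 3.8 or later, where xml.etree.ElementTree accepts {*}name to match a local name in any namespace, which plays the same role as the local-name()-only XPath. The second vocabulary (urn:example:other) and its sheet element are hypothetical, invented only to show the collision.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
OTHER = "urn:example:other"  # hypothetical second vocabulary, not from the original document

# A mixed-vocabulary document: two different "sheet" elements share a local name.
XML = (
    f'<workbook xmlns="{MAIN}" xmlns:o="{OTHER}">'
    '<sheets>'
    '<o:sheet name="NotASpreadsheetSheet"/>'
    '<sheet name="Sheet1" sheetId="1"/>'
    '</sheets>'
    '</workbook>'
)

root = ET.fromstring(XML)

# Namespace-agnostic match: picks up "sheet" elements from both vocabularies.
names_any = [e.get("name") for e in root.findall("{*}sheets/{*}sheet")]

# Namespace-qualified match: only the spreadsheet vocabulary's "sheet" elements.
names_main = [
    e.get("name") for e in root.findall(f"{{{MAIN}}}sheets/{{{MAIN}}}sheet")
]

print(names_any)
print(names_main)
```

With the namespace-agnostic path, a positional predicate such as [1] would select the element from the other vocabulary here, because it comes first in document order.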