
Why does NamespaceManager not use DefaultNamespace when using no prefix in XPath

When I wanted to traverse my XmlDocument using XPath, I ran into the problem that the document contains many ugly namespaces, so I started using an XmlNamespaceManager along with the XPath.

The XML looks like this

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
    <Worksheet ss:Name="KA0100401">
        <Table>
            <Row>
                <Cell>Data</Cell>
            </Row>
            <!-- more rows... -->
        </Table>
    </Worksheet>
    <Worksheet ss:Name="KA0100402">
        <!-- .... -->
    </Worksheet>
</Workbook>

Now, from what I see from this document, "urn:schemas-microsoft-com:office:spreadsheet" is the default namespace, because it sits on the root element.

So, naively, I configured my NamespaceManager like this:

XmlDocument document = new XmlDocument();
document.Load(reader);
XmlNamespaceManager manager = new XmlNamespaceManager(document.NameTable);
manager.AddNamespace(String.Empty, "urn:schemas-microsoft-com:office:spreadsheet");
manager.AddNamespace("o", "urn:schemas-microsoft-com:office:office");
manager.AddNamespace("x", "urn:schemas-microsoft-com:office:excel");
manager.AddNamespace("ss", "urn:schemas-microsoft-com:office:spreadsheet");
manager.AddNamespace("html", "http://www.w3.org/TR/REC-html40");

But, when I try to access a node

foreach (XmlNode row in document.SelectNodes("/Workbook/Worksheet[1]/Table/Row", manager))

I never get any results. I was under the impression that by registering the first namespace with an empty prefix, I wouldn't need to specify a prefix when searching for nodes in that namespace.

But, as it is stated on the AddNamespace method:

If an XPath expression does not include a prefix, it is assumed that the namespace Uniform Resource Identifier (URI) is the empty namespace.

Why is that? And, more importantly: how do I access nodes in the default namespace, if omitting the prefix puts them in the empty namespace?

What good is setting the default namespace on the manager if I can't even access it when searching for nodes?

asked Oct 29 '14 12:10 by F.P



2 Answers

From the XPath 1.0 spec:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

So this is not a matter regarding NamespaceManager but rather the way XPath is defined to work.
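The rule quoted above can be checked with a minimal sketch. The sample XML, the `urn:example` URI, the `e` prefix, and the class and method names are all made up for illustration; only `XmlDocument`, `XmlNamespaceManager`, and `SelectNodes` come from the question.

```csharp
using System;
using System.Xml;

public static class UnprefixedNameTestDemo
{
    // Hypothetical sample: <a> is in no namespace; <b> and its child <c>
    // are in the default namespace "urn:example" declared on <b>.
    private const string SampleXml = "<a><b xmlns=\"urn:example\"><c/></b></a>";

    // Returns how many nodes the given XPath selects in the sample document.
    public static int Count(string xpath)
    {
        var document = new XmlDocument();
        document.LoadXml(SampleXml);

        var manager = new XmlNamespaceManager(document.NameTable);
        manager.AddNamespace(String.Empty, "urn:example"); // not consulted for unprefixed name tests
        manager.AddNamespace("e", "urn:example");          // illustrative prefix for the same URI

        return document.SelectNodes(xpath, manager).Count;
    }

    public static void Main()
    {
        Console.WriteLine(Count("/a"));         // 1: <a> really is in no namespace
        Console.WriteLine(Count("/a/b"));       // 0: unprefixed "b" means "b in no namespace"
        Console.WriteLine(Count("/a/e:b"));     // 1: the prefix maps to the element's namespace
        Console.WriteLine(Count("/a/e:b/e:c")); // 1: <c> inherits the default namespace from <b>
    }
}
```

An unprefixed name test is therefore still useful: it is how you select elements that genuinely have no namespace.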


The point that you're missing is that the prefixes you use in your NamespaceManager don't have to be anything like the ones in your XML document. You can use the xcel prefix for urn:schemas-microsoft-com:office:excel if you want, and the sp prefix for urn:schemas-microsoft-com:office:spreadsheet. In fact, you're already assigning a prefix for that URN in your namespace manager, so you can just use that:

foreach (XmlNode row in 
       document.SelectNodes("/ss:Workbook/ss:Worksheet[1]/ss:Table/ss:Row", manager))
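To illustrate that the prefix is purely a matter of the query, here is a small self-contained sketch against a trimmed-down copy of the question's XML. The `sp` prefix and the class and method names are assumptions chosen for the example; the document itself only declares `ss` and the default namespace.

```csharp
using System;
using System.Xml;

public static class ArbitraryPrefixDemo
{
    // Trimmed-down version of the workbook from the question.
    private const string WorkbookXml = @"
<Workbook xmlns=""urn:schemas-microsoft-com:office:spreadsheet""
          xmlns:ss=""urn:schemas-microsoft-com:office:spreadsheet"">
    <Worksheet ss:Name=""KA0100401"">
        <Table>
            <Row><Cell>Data</Cell></Row>
        </Table>
    </Worksheet>
    <Worksheet ss:Name=""KA0100402"" />
</Workbook>";

    // Returns how many nodes the given XPath selects in the sample workbook.
    public static int Count(string xpath)
    {
        var document = new XmlDocument();
        document.LoadXml(WorkbookXml);

        var manager = new XmlNamespaceManager(document.NameTable);
        // "sp" does not appear anywhere in the XML; only the URI has to match.
        manager.AddNamespace("sp", "urn:schemas-microsoft-com:office:spreadsheet");

        return document.SelectNodes(xpath, manager).Count;
    }

    public static void Main()
    {
        // Unprefixed names look in the empty namespace and find nothing.
        Console.WriteLine(Count("/Workbook/Worksheet[1]/Table/Row"));                 // 0
        // Prefixed names resolve to the spreadsheet namespace.
        Console.WriteLine(Count("/sp:Workbook/sp:Worksheet[1]/sp:Table/sp:Row"));     // 1
        // The ss:Name attribute is in the same namespace, so sp:Name matches it too.
        Console.WriteLine(Count("/sp:Workbook/sp:Worksheet[@sp:Name='KA0100401']"));  // 1
    }
}
```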


Regarding this question:

What good is setting the default namespace on the manager if I can't even access it when searching for nodes?

The good is that XmlNamespaceManager is used for more than just evaluating XPath. For example, it could be used to keep track of the namespaces in an XML document, in which there is a concept of default namespaces.
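As a rough sketch of that non-XPath use, the manager can report and scope a default namespace on its own. The class name, the `BuildManager` helper, and the `urn:example:inner` URI are hypothetical; `DefaultNamespace`, `LookupNamespace`, `PushScope`, and `PopScope` are members of `XmlNamespaceManager`.

```csharp
using System;
using System.Xml;

public static class DefaultNamespaceLookupDemo
{
    public const string SpreadsheetUri = "urn:schemas-microsoft-com:office:spreadsheet";

    // Builds a manager whose only registration is a default namespace.
    public static XmlNamespaceManager BuildManager()
    {
        var manager = new XmlNamespaceManager(new NameTable());
        manager.AddNamespace(String.Empty, SpreadsheetUri);
        return manager;
    }

    public static void Main()
    {
        var manager = BuildManager();

        // Outside of XPath, the empty prefix resolves to the default namespace.
        Console.WriteLine(manager.DefaultNamespace);
        Console.WriteLine(manager.LookupNamespace(String.Empty));

        // Scopes mirror nested elements that redeclare xmlns.
        manager.PushScope();
        manager.AddNamespace(String.Empty, "urn:example:inner"); // hypothetical inner default
        Console.WriteLine(manager.DefaultNamespace);             // the inner URI

        manager.PopScope();
        Console.WriteLine(manager.DefaultNamespace);             // back to the spreadsheet URI
    }
}
```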

answered Sep 21 '22 12:09 by JLRishe


@JLRishe's answer is correct for accessing nodes in the default namespace (i.e., always map a prefix to the default namespace in the XmlNamespaceManager).

Read in full, the documentation you quoted (MSDN, XmlNamespaceManager.AddNamespace) states that the default "empty" prefix is not used in XPath expressions:

prefix Type: System.String

The prefix to associate with the namespace being added. Use String.Empty to add a default namespace.

Note If the XmlNamespaceManager will be used for resolving namespaces in an XML Path Language (XPath) expression, a prefix must be specified. If an XPath expression does not include a prefix, it is assumed that the namespace Uniform Resource Identifier (URI) is the empty namespace. For more information about XPath expressions and the XmlNamespaceManager, refer to the XmlNode.SelectNodes and XPathExpression.SetContext methods.

answered Sep 21 '22 12:09 by Matt