
XML namespaces and XPath

Tags: c#, xml, xpath
I have an application that has to load an XML document and output the nodes selected by an XPath expression.

Suppose I start with a document like this:

<aaa>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
  <bbb>text</bbb>
  ...[many nodes here]...
</aaa>

With the XPath //bbb, everything works so far: the selection doc.SelectNodes("//bbb"); returns the list of required nodes.

Then someone uploads a document with one node like <myfancynamespace:foo/> and an extra namespace in the root tag, and everything breaks.

Why? //bbb does not give a damn about myfancynamespace. In theory it should even work with //myfancynamespace:foo, as there is no ambiguity, but the expression returns 0 results and that's it.

Is there a workaround for this behavior?

I do have a namespace manager for the document, and I am passing it to the XPath query. But the namespaces and the prefixes are unknown to me, so I can't add them before the query.

Do I have to pre-parse the document to fill the namespace manager before I do any selections? Why on earth does it behave this way? It just doesn't make sense.

EDIT:

I'm using: XmlDocument and XmlNamespaceManager

EDIT2:

XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
// I wish I could:
// nsmgr.AddNamespace("magic", "http://magicnamespaceuri/");
// ...
doc.LoadXml(usersuppliedxml);
XmlNodeList nodes = doc.SelectNodes(usersuppliedxpath, nsmgr); // usersuppliedxpath -> "//bbb"

// nodes.Count should be > 0, but with a namespaced document it is 0

EDIT3: Found an article that describes exactly this issue, along with one workaround, though not a very pretty one: http://codeclimber.net.nz/archive/2008/01/09/How-to-query-a-XPath-doc-that-has-a-default.aspx

It almost seems that stripping the xmlns is the way to go...

asked Apr 18 '11 12:04 by Coder



1 Answer

You're missing the whole point of XML namespaces.

But if you really need to perform XPath on documents that will use an unknown namespace, and you really don't care about it, you will need to strip it out and reload the document. XPath will not work in a namespace-agnostic way, unless you want to use the local-name() function at every point in your selectors.

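// Resets the default namespace on the root element and reloads the document.
// Prefixed namespaces and declarations on child elements are left untouched.
private XmlDocument StripNamespace(XmlDocument doc)
{
    if (doc.DocumentElement.NamespaceURI.Length > 0)
    {
        // overwrite the root's default namespace declaration with an empty one
        doc.DocumentElement.SetAttribute("xmlns", "");
        // must serialize and reload for this to take effect
        XmlDocument newDoc = new XmlDocument();
        newDoc.LoadXml(doc.OuterXml);
        return newDoc;
    }
    else
    {
        // root is not in a namespace, nothing to strip
        return doc;
    }
}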
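
For the local-name() alternative mentioned above, here is a minimal sketch of what such a selector could look like. The class name LocalNameDemo, the helpers Load and CountByLocalName, and the sample XML with its urn:example:* namespace URIs are all made up for illustration. It assumes the element name you search for contains no apostrophe, since the name is spliced directly into the XPath string.

```csharp
using System;
using System.Xml;

public static class LocalNameDemo
{
    // Hypothetical helper: loads XML into an XmlDocument without an external resolver.
    public static XmlDocument Load(string xml)
    {
        var doc = new XmlDocument();
        doc.XmlResolver = null;
        doc.LoadXml(xml);
        return doc;
    }

    // Hypothetical helper: counts elements whose local name matches,
    // whatever namespace they belong to.
    // Assumes localName contains no apostrophe (it is spliced into the XPath).
    public static int CountByLocalName(XmlDocument doc, string localName)
    {
        XmlNodeList nodes = doc.SelectNodes("//*[local-name()='" + localName + "']");
        return nodes.Count;
    }

    public static void Main()
    {
        // Made-up sample: a default namespace on the root plus one prefixed namespace.
        string xml =
            "<aaa xmlns='urn:example:default' xmlns:f='urn:example:fancy'>" +
            "<bbb>text</bbb><f:foo/><bbb>text</bbb></aaa>";
        XmlDocument doc = Load(xml);

        // The unprefixed name only matches elements in no namespace,
        // but here bbb inherits the default namespace from the root.
        Console.WriteLine(doc.SelectNodes("//bbb").Count);   // prints 0

        // Matching on local-name() ignores the namespace.
        Console.WriteLine(CountByLocalName(doc, "bbb"));     // prints 2
    }
}
```

The trade-off is that every step of a user-supplied path such as //aaa/bbb would have to be rewritten this way, which is why stripping the namespace can be simpler when the namespaces truly do not matter.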
answered Oct 02 '22 07:10 by harpo