Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get nodes from xml files

Tags:

c#

xml

winforms

How to parse the xml file?

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
<sitemap> 
    <loc>link</loc>
    <lastmod>2011-08-17T08:23:17+00:00</lastmod> 
</sitemap> 
<sitemap>
    <loc>link</loc> 
    <lastmod>2011-08-18T08:23:17+00:00</lastmod> 
</sitemap> 
</sitemapindex>

I am new to XML, I tried this, but it seems to be not working :

        XmlDocument xml = new XmlDocument(); //* create an xml document object. 
        xml.Load("sitemap.xml");
        XmlNodeList xnList = xml.SelectNodes("/sitemapindex/sitemap");
        foreach (XmlNode xn in xnList)
        {
            String loc= xn["loc"].InnerText;
            String lastmod= xn["lastmod"].InnerText;
        }
like image 467
Andrew Avatar asked Jul 05 '12 16:07

Andrew


People also ask

How do I find nodes in XML?

To find nodes in an XML file you can use XPath expressions. Method XmlNode. SelectNodes returns a list of nodes selected by the XPath string.

What are nodes in XML file?

Everything in an XML document is a node. For example, the entire document is the document node, and every element is an element node. The topmost node of a tree. In the case of XML documents, it is always the document node, and not the top-most element.

How can we access the data from XML elements?

Retrieving information from XML files by using the Document Object Model, XmlReader class, XmlDocument class, and XmlNode class. Synchronizing DataSet data with XML via the XmlDataDocument class. Executing XML queries with XPath and the XPathNavigator class.

How many root nodes are there in XML?

While a properly formed XML file can only have a single root element, an XSD or DTD file can contain multiple roots.


1 Answers

The problem is that the sitemapindex element defines a default namespace. You need to specify the namespace when you select the nodes, otherwise it will not find them. For instance:

XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("s", "http://www.sitemaps.org/schemas/sitemap/0.9");
XmlNodeList xnList = xml.SelectNodes("/s:sitemapindex/s:sitemap", manager);

Normally speaking, when using the XmlNameSpaceManager, you could leave the prefix as an empty string to specify that you want that namespace to be the default namespace. So you would think you'd be able to do something like this:

// WON'T WORK
XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("", "http://www.sitemaps.org/schemas/sitemap/0.9"); //Empty prefix
XmlNodeList xnList = xml.SelectNodes("/sitemapindex/sitemap", manager); //No prefixes in XPath

However, if you try that code, you'll find that it won't find any matching nodes. The reason for this is that in XPath 1.0 (which is what XmlDocument implements), when no namespace is provided, it always uses the null namespace, not the default namespace. So, it doesn't matter if you specify a default namespace in the XmlNamespaceManager, it's not going to be used by XPath, anyway. To quote the relevant paragraph from the Official XPath Specification:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

Therefore, when the elements you are reading belong to a namespace, you can't avoid putting the namespace prefix in your XPath statements. However, if you don't want to bother putting the namespace URI in your code, you can just use the XmlDocument object to return the URI of the root element, which in this case, is what you want. For instance:

XmlDocument xml = new XmlDocument();
xml.Load("sitemap.xml");
XmlNamespaceManager manager = new XmlNamespaceManager(xml.NameTable);
manager.AddNamespace("s", xml.DocumentElement.NamespaceURI); //Using xml's properties instead of hard-coded URI
XmlNodeList xnList = xml.SelectNodes("/s:sitemapindex/s:sitemap", manager);
like image 145
Steven Doggart Avatar answered Sep 20 '22 19:09

Steven Doggart