
How do I retrieve element text inside CDATA markup via XPath?

Tags:

xpath

Consider the following XML fragment:

    <Obj>
        <Name><![CDATA[SomeText]]></Name>
    </Obj>

How do I retrieve the "SomeText" value via XPath? I'm using Nauman Leghari's (excellent) Visual XPath tool.
/Obj/Name returns the element
/Obj/Name/text() returns blank

I don't think it's a problem with the tool (I may be wrong). I also read that XPath can't extract CDATA (see the last response in this thread), which sounds kind of weird to me.

Gishu asked Feb 20 '09 04:02


1 Answer

/Obj/Name/text() is the XPath to return the content of the CDATA markup.
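As a minimal sketch of that statement, assuming the standard System.Xml DOM classes and the fragment from the question, the following console program selects the text() node and prints its type and value. The class name CDataXPathDemo is only illustrative and is not part of the original post.

```csharp
using System;
using System.Xml;

class CDataXPathDemo
{
    static void Main()
    {
        // The fragment from the question, loaded into a DOM document.
        var doc = new XmlDocument();
        doc.LoadXml("<Obj><Name><![CDATA[SomeText]]></Name></Obj>");

        // text() selects the character data under Name; the DOM exposes
        // a CDATA section as a node of type XmlNodeType.CDATA.
        XmlNode node = doc.SelectSingleNode("/Obj/Name/text()");

        Console.WriteLine(node.NodeType); // CDATA
        Console.WriteLine(node.Value);    // SomeText
    }
}
```

If a tool shows a blank result for this expression, the likely cause is how the tool displays CDATA nodes rather than the XPath itself.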

What threw me off was the behavior of the Value property. In the DOM world (XmlNode), the XmlNode.Value property of an element, with CDATA or otherwise, returns null. The InnerText property gives you the CDATA/text content. If you use System.Xml.Linq, XElement.Value returns the CDATA content.

    string sXml = @"
    <object>
        <name><![CDATA[SomeText]]></name>
        <name>OtherName</name>
    </object>";

    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml( sXml );
    XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);

    Console.WriteLine(@"XPath = /object/name" );
    WriteNodesToConsole(xmlDoc.SelectNodes("/object/name", nsMgr));

    Console.WriteLine(@"XPath = /object/name/text()" );
    WriteNodesToConsole( xmlDoc.SelectNodes("/object/name/text()", nsMgr) );

    Console.WriteLine(@"Xml.Linq = obRoot.Elements(""name"")");
    XElement obRoot = XElement.Parse( sXml );
    WriteNodesToConsole( obRoot.Elements("name") );

Output:

    XPath = /object/name
            NodeType = Element
            Value = <null>
            OuterXml = <name><![CDATA[SomeText]]></name>
            InnerXml = <![CDATA[SomeText]]>
            InnerText = SomeText

            NodeType = Element
            Value = <null>
            OuterXml = <name>OtherName</name>
            InnerXml = OtherName
            InnerText = OtherName

    XPath = /object/name/text()
            NodeType = CDATA
            Value = SomeText
            OuterXml = <![CDATA[SomeText]]>
            InnerXml =
            InnerText = SomeText

            NodeType = Text
            Value = OtherName
            OuterXml = OtherName
            InnerXml =
            InnerText = OtherName

    Xml.Linq = obRoot.Elements("name")
            Value = SomeText
            Value = OtherName
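The WriteNodesToConsole helper is not shown in the post. A hypothetical reconstruction, assuming it simply prints the properties that appear in the output, might look like the sketch below. The class name NodeDump, the two overloads, and the exact formatting are assumptions, not the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

static class NodeDump
{
    // Assumed overload for DOM results: prints the properties compared in the post.
    public static void WriteNodesToConsole(XmlNodeList nodes)
    {
        foreach (XmlNode node in nodes)
        {
            Console.WriteLine("\tNodeType = {0}", node.NodeType);
            // XmlNode.Value is null for element nodes, so print a marker instead.
            Console.WriteLine("\tValue = {0}", node.Value ?? "<null>");
            Console.WriteLine("\tOuterXml = {0}", node.OuterXml);
            Console.WriteLine("\tInnerXml = {0}", node.InnerXml);
            Console.WriteLine("\tInnerText = {0}", node.InnerText);
            Console.WriteLine();
        }
    }

    // Assumed overload for LINQ to XML results: only Value is printed.
    public static void WriteNodesToConsole(IEnumerable<XElement> elements)
    {
        foreach (XElement element in elements)
        {
            Console.WriteLine("\tValue = {0}", element.Value);
        }
    }

    static void Main()
    {
        const string xml = "<object><name><![CDATA[SomeText]]></name><name>OtherName</name></object>";

        var doc = new XmlDocument();
        doc.LoadXml(xml);
        WriteNodesToConsole(doc.SelectNodes("/object/name/text()"));

        WriteNodesToConsole(XElement.Parse(xml).Elements("name"));
    }
}
```

Two overloads are used here because XmlNodeList and IEnumerable<XElement> share no common node type; the original helper may have been structured differently.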

It turned out that the author of Visual XPath had left a TODO for the CDATA type of XmlNode. With a small code snippet added, I have CDATA support now.

MainForm.cs

    private void Xml2Tree( TreeNode tNode, XmlNode xNode)
    {
       ...
       case XmlNodeType.CDATA:
          //MessageBox.Show("TODO: XmlNodeType.CDATA");
          // Gishu
          TreeNode cdataNode = new TreeNode("![CDATA[" + xNode.Value + "]]");
          cdataNode.ForeColor = Color.Blue;
          cdataNode.NodeFont = new Font("Tahoma", 12);
          tNode.Nodes.Add(cdataNode);
          //Gishu
          break;
Gishu answered Sep 28 '22 16:09