
XPath - Difference between node() and text()

Tags:

xml

expression

xpath

I'm having trouble understanding the difference between text() and node(). From what I understand, text() would be whatever is in between the tags <item>apple</item>, which is apple in this case. Node would be whatever that node actually is, which would be item.

But then I've been assigned some work where it asks me to "Select the text of all items under produce" and a separate question asks "Select all the manager nodes in all departments".

How is the output supposed to look with text() as opposed to node()?

Snippet of XML:

<produce>
  <item>apple</item>
  <item>banana</item>
  <item>pepper</item>
</produce>
<department>
  <phone>123-456-7891</phone>
  <manager>John</manager>
</department>

Of course, there are more departments and more managers, but this was just a snippet of code.

Any help would be much appreciated!

252

asked Jul 31 '12 16:07

Pztar

1 Answer

text() and node() are node tests, in XPath terminology.

Node tests operate on a set (on an axis, to be exact) of nodes and return the ones that are of a certain type. When no axis is mentioned, the child axis is assumed by default.

There are all kinds of node tests:

node() matches any node (the least specific node test of them all)
text() matches text nodes only
comment() matches comment nodes
* matches any element node
foo matches any element node named "foo"
processing-instruction() matches PI nodes (they look like <?name value?>).
Side note: The * also matches attribute nodes, but only along the attribute axis. @* is shorthand for attribute::*. Attributes are not part of the child axis, which is why a plain * does not select them.
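To make the list above concrete, here is a minimal sketch that imitates these node tests as filters over the child nodes of an element. It uses Python's standard xml.dom.minidom and does not evaluate XPath itself. The helper child_axis, the sample document, and its season attribute are hypothetical and exist only for illustration.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical sample: one child of each kind (comment, element, text, PI)
# plus one attribute on the parent element.
XML = '<produce season="summer"><!-- fresh --><item>apple</item> and <?stock check?></produce>'

def child_axis(node, node_test):
    """Filter the children of `node` with an XPath-style node test."""
    kids = list(node.childNodes)
    if node_test == "node()":
        return kids  # any node type passes
    by_type = {
        "text()": Node.TEXT_NODE,
        "comment()": Node.COMMENT_NODE,
        "processing-instruction()": Node.PROCESSING_INSTRUCTION_NODE,
        "*": Node.ELEMENT_NODE,
    }
    if node_test in by_type:
        return [k for k in kids if k.nodeType == by_type[node_test]]
    # Anything else is treated as a name test, e.g. "item".
    return [k for k in kids
            if k.nodeType == Node.ELEMENT_NODE and k.tagName == node_test]

produce = minidom.parseString(XML).documentElement
for test in ("node()", "text()", "comment()", "*", "item", "processing-instruction()"):
    print(test, len(child_axis(produce, test)))

# Attributes are stored separately from the children, which mirrors
# the attribute axis being distinct from the child axis.
print(len(produce.attributes))  # 1
```

Here node() passes all four children, while each of the other tests passes exactly one of them. The attribute never shows up among the children at all.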

This XML document:

<produce>
    <item>apple</item>
    <item>banana</item>
    <item>pepper</item>
</produce>

represents the following DOM (simplified):

root node
   element node (name="produce")
      text node (value="\n    ")
      element node (name="item")
         text node (value="apple")
      text node (value="\n    ")
      element node (name="item")
         text node (value="banana")
      text node (value="\n    ")
      element node (name="item")
         text node (value="pepper")
      text node (value="\n")

So with XPath:

/ selects the root node
/produce selects a child element of the root node if it has the name "produce" (This is the document element, the single top-level element of the document. The root node is its parent and represents the document itself. Document element and root node are often confused, but they are not the same thing.)
/produce/node() selects any type of child node beneath /produce/ (i.e. all 7 children)
/produce/text() selects the 4 (!) whitespace-only text nodes
/produce/item[1] selects the first child element named "item"
/produce/item[1]/text() selects all child text nodes (there's only one - "apple" - in this case)
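The node counts above can be checked with a short sketch. It walks the DOM with Python's standard xml.dom.minidom rather than evaluating XPath, so each list comprehension is only a hand-written stand-in for the expression named in its comment. The variable names are my own.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

XML = """<produce>
    <item>apple</item>
    <item>banana</item>
    <item>pepper</item>
</produce>"""

doc = minidom.parseString(XML)   # root node, XPath: /
produce = doc.documentElement    # document element, XPath: /produce

# /produce/node(): every child, regardless of type
all_children = list(produce.childNodes)

# /produce/text(): only the text-node children (all whitespace here)
text_children = [n for n in all_children if n.nodeType == Node.TEXT_NODE]

# /produce/item: element children named "item"
items = [n for n in all_children
         if n.nodeType == Node.ELEMENT_NODE and n.tagName == "item"]

# /produce/item[1]/text(): text-node children of the first item
first_item_text = [n.data for n in items[0].childNodes
                   if n.nodeType == Node.TEXT_NODE]

print(len(all_children))                 # 7
print([t.data for t in text_children])   # ['\n    ', '\n    ', '\n    ', '\n']
print(first_item_text)                   # ['apple']
```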

And so on.

So, your questions

"Select the text of all items under produce" /produce/item/text() (3 nodes selected)
"Select all the manager nodes in all departments" //department/manager (1 node selected)

Notes

The default axis in XPath is the child axis. You can change the axis by prefixing a different axis name. For example: //item/ancestor::produce
Element nodes have string values. When an element node is converted to a string, the concatenated text of all its descendant text nodes is returned. In this example, /produce/item[1]/text() and string(/produce/item[1]) yield the same text, "apple", because the item contains a single text node.
Also see this answer where I outline the individual parts of an XPath expression graphically.
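The relationship between text() and string() in the notes above is easiest to see with mixed content. The following is a minimal sketch with Python's standard xml.dom.minidom. The helpers text_children and string_value and the mixed-content <item> are hypothetical, and string_value is my own approximation of the XPath string value of an element (the concatenation of its descendant text nodes).

```python
from xml.dom import minidom
from xml.dom.minidom import Node

def text_children(element):
    """Stand-in for element/text(): data of the direct text-node children."""
    return [n.data for n in element.childNodes if n.nodeType == Node.TEXT_NODE]

def string_value(node):
    """Approximate string(node): concatenate all descendant text nodes."""
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data
    # Comments and processing instructions have no children in minidom,
    # so they contribute the empty string.
    return "".join(string_value(child) for child in node.childNodes)

simple = minidom.parseString("<item>apple</item>").documentElement
# Hypothetical mixed content: text interleaved with a child element.
mixed = minidom.parseString("<item>red <b>bell</b> pepper</item>").documentElement

print(text_children(simple), repr(string_value(simple)))
# ['apple'] 'apple'
print(text_children(mixed), repr(string_value(mixed)))
# ['red ', ' pepper'] 'red bell pepper'
```

For the simple item the two agree. For the mixed one, text() sees only the two direct text nodes, while the string value also includes the text inside <b>.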

138

answered Sep 25 '22 11:09

Tomalak
