
Xpath - get only node content without other elements

Tags:

xpath

I have a div element:

<div>    This is some text    <h1>This is a title</h1>    <div>Some other content</div> </div> 

Which XPath expression should I use to get only the div's own text content, without its child elements h1 and div?

//div[not(h1)&not(div)]

Something like that? I cannot figure it out

asked Dec 15 '10 22:12 by nticaric




1 Answer

To get the string value of div use:

string(/div) 

This is the concatenation of all text nodes that are descendants of the (top) div element.

To select all text node descendants of div use:

/div//text() 

To get only the text nodes that are direct children of div use:

/div/text() 

Finally, to get the first (and hopefully only) non-whitespace-only text node child of div use:

/div/text()[normalize-space()][1] 
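
To see how the four expressions differ on the markup from the question, here is a minimal Python sketch. It assumes only the standard library's xml.etree.ElementTree, whose limited XPath support does not include text(), so each expression is emulated with the element's text and tail attributes rather than evaluated directly. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The markup from the question, with its original whitespace preserved.
doc = (
    "<div>    This is some text    <h1>This is a title</h1>"
    "    <div>Some other content</div> </div>"
)
root = ET.fromstring(doc)

# Emulates /div//text(): every text node below the top div, in document order.
all_text_nodes = list(root.itertext())

# Emulates string(/div): the concatenation of all those text nodes.
string_value = "".join(all_text_nodes)

# Emulates /div/text(): only text nodes that are direct children of the div.
# In ElementTree these are the div's own .text plus the .tail of each child.
candidates = [root.text] + [child.tail for child in root]
direct_text_nodes = [t for t in candidates if t]

# Emulates /div/text()[normalize-space()][1]: the first direct text node
# that is not whitespace-only (str.strip approximates normalize-space here).
first_non_blank = next((t for t in direct_text_nodes if t.strip()), None)

print(repr(string_value))
print(all_text_nodes)
print(direct_text_nodes)
print(repr(first_non_blank))
```

Note that the direct text node children include the whitespace-only nodes that sit between and after the child elements, which is why the last expression filters with normalize-space(). With a library that implements full XPath 1.0, the expressions above could be evaluated as written instead of emulated.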
answered Oct 05 '22 19:10 by Dimitre Novatchev