
How do I construct an XPath to select items that do not contain a string?

Tags:

xpath

How do I use something similar to the example below, but with the opposite result, i.e. select the items that do not contain (default) in their text?

<test>
     <item>Some text (default)</item>
     <item>Some more text</item>
     <item>Even more text</item>
</test>

Given this

//test/item[contains(text(), '(default)')]

would return the first item. Is there a not operator that I can use with contains?

Asked by Danny, Apr 20 '10 17:04


1 Answer

Yes, there is:

//test/item[not(contains(text(), '(default)'))]

Hint: in XPath, not() is a function, not an operator.
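To try the expression against the sample document from the question, here is a minimal sketch in Python. It assumes the third-party lxml package is installed (the standard library's ElementTree does not support contains() or not()); the variable names are only illustrative.

```python
from lxml import etree  # assumption: third-party lxml is available

XML = """<test>
     <item>Some text (default)</item>
     <item>Some more text</item>
     <item>Even more text</item>
</test>"""

root = etree.fromstring(XML)

# Select every <item> whose text does not contain '(default)'.
items = root.xpath("//test/item[not(contains(text(), '(default)'))]")

texts = [item.text for item in items]
print(texts)  # ['Some more text', 'Even more text']
```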


An alternative, possibly better way to express this is:

//test/item[not(text()[contains(., '(default)')])]

There is a subtle but important difference between the two expressions (let's call them A and B, respectively).

Simple case: If every <item> has only a single text node child, A and B behave the same.

Complex case: If an <item> can have multiple text node children, the contains() in expression A only finds '(default)' when it occurs in the first of them, so A still selects an item whose later text nodes contain the string.

This is because text() matches all text node children and produces a node-set. So far no surprise. Now, contains() accepts a node-set as its first argument, but it needs to convert it to a string to do its job. In XPath 1.0, converting a node-set to a string produces only the string value of the first node in the set; all other nodes are disregarded (try string(//item) to see what I mean). In the simple case this is exactly what happens as well, but the result is not as surprising.

Expression B deals with this by explicitly checking every text node individually instead of only the first one. It's therefore the more robust of the two.
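The difference between A and B only shows up with mixed content. The following sketch makes it visible; it assumes the third-party lxml package (an XPath 1.0 engine), and the sample document with a <b> child is made up for illustration, not taken from the question.

```python
from lxml import etree  # assumption: third-party lxml is available

# Hypothetical document: the first <item> has two text node children,
# "Some text " and " (default)", separated by a <b> element.
XML = """<test>
     <item>Some text <b>bold</b> (default)</item>
     <item>Some more text</item>
</test>"""

root = etree.fromstring(XML)

expr_a = "//test/item[not(contains(text(), '(default)'))]"
expr_b = "//test/item[not(text()[contains(., '(default)')])]"

# A only looks at the first text node ("Some text "), so it does not
# see '(default)' and selects both items.
count_a = len(root.xpath(expr_a))

# B tests each text node on its own, finds ' (default)', and
# excludes the first item.
count_b = len(root.xpath(expr_b))

# Node-set to string conversion uses only the first node in the set,
# here the first <item>.
first_string = root.xpath("string(//item)")

print(count_a, count_b)  # 2 1
print(first_string)      # Some text bold (default)
```

Note that string(//item) returns the string value of the first <item> only; the second item never appears in the result.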

Answered by Tomalak, Oct 08 '22 00:10