How do I construct an XPath to select items that do not contain a string?



How do I use something similar to the example below, but with the opposite result, i.e. select the items that do not contain "(default)" in their text?

     <item>Some text (default)</item>
     <item>Some more text</item>
     <item>Even more text</item>

Given this, the expression

//test/item[contains(text(), '(default)')]

would return the first item. Is there a not operator that I can use with contains?

1 Answer

Yes, there is:

//test/item[not(contains(text(), '(default)'))]

Hint: in XPath, not() is a function rather than an operator.

An alternative, possibly better way to express this is:

//test/item[not(text()[contains(., '(default)')])]

There is a subtle but important difference between the two expressions (let's call them A and B, respectively).

Simple case: If every <item> has only a single text node child, A and B behave the same.

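Complex case: If an <item> can have multiple text node children, expression A only looks at the first of them. An <item> whose '(default)' occurs in a later text node is therefore still selected by A, even though it does contain the string.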

This is because text() matches all text node children and produces a node-set. So far no surprise. Now, contains() accepts a node-set as its first argument, but it needs to convert it to a string to do its job. In XPath 1.0, converting a node-set to a string yields only the string value of the first node in the set (in document order); all other nodes are disregarded (try string(//item) to see what I mean). In the simple case this is exactly what happens as well, but the result is not as surprising.

Expression B deals with this by testing every text node individually (the inner predicate runs contains() once per text node) instead of only testing the string value of the first one. It is therefore the more robust of the two.
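To make the difference concrete, here is a rough sketch in Python that emulates the two expressions by hand. It does not run an XPath engine (the standard library's ElementTree supports neither not() nor contains()), so the helpers text_nodes, select_a and select_b are hypothetical names that only mimic the XPath 1.0 rules described above. The <test> wrapper and the third <item>, with a nested <b> element that splits its text into two text nodes, are assumptions added to trigger the complex case.

```python
import xml.etree.ElementTree as ET

# Assumed sample document: the third <item> has two text node children,
# "Even more " and " text (default)", separated by the <b> element.
XML = """<test>
  <item>Some text (default)</item>
  <item>Some more text</item>
  <item>Even more <b>bold</b> text (default)</item>
</test>"""


def text_nodes(elem):
    """Return the direct text node children of elem, in document order."""
    nodes = []
    if elem.text:
        nodes.append(elem.text)
    for child in elem:
        if child.tail:  # text that follows a child element
            nodes.append(child.tail)
    return nodes


def select_a(root, needle):
    """Mimic //test/item[not(contains(text(), needle))].

    Only the first text node is converted to a string and checked.
    """
    selected = []
    for item in root.findall("item"):
        nodes = text_nodes(item)
        first = nodes[0] if nodes else ""  # empty node-set converts to ""
        if needle not in first:
            selected.append(item)
    return selected


def select_b(root, needle):
    """Mimic //test/item[not(text()[contains(., needle)])].

    Every text node is checked individually.
    """
    return [
        item
        for item in root.findall("item")
        if not any(needle in node for node in text_nodes(item))
    ]


root = ET.fromstring(XML)
for name, select in (("A", select_a), ("B", select_b)):
    print(name, ["".join(item.itertext()) for item in select(root, "(default)")])
```

In this sketch, A keeps the third item because its first text node is just "Even more ", while B rejects it because the second text node contains '(default)'. Against a real XPath 1.0 processor, the same document should show the same split between the two expressions.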

