What does contains() do in XPath?

Question

I have two almost identical tables, the only difference being the input tag in the first one:

Table #1

  <table>
    <tbody>
      <tr>
        <td>
          <div>
            <input type="text" name="" value=""/>
          </div>
        </td>
      </tr>
    </tbody>
  </table>

Table #2

  <table>
    <tbody>
      <tr>
        <td>
          <div></div>
        </td>
      </tr>
    </tbody>
  </table>

When I use the XPath //table//tbody//tr[position()=1 and contains(.,input)], it returns the first row of both tables, not just the first row of the first table as I expected.

However, the XPath //table//tbody//tr[position()=1]//input returns just the input in the first table.

So, what am I doing wrong? Why is the same input associated with both tables? Am I misusing the . here somehow?

kjhughes · Accepted Answer

Due to an unfortunate choice in function names¹, many people mistake the purpose of the contains() function in XPath:

XPath contains() does not check for element containment.
XPath contains() checks for substring containment.

Therefore, tr[contains(.,input)] doesn't do what you think it does. It actually selects tr elements whose string-value contains a substring equal to the string-value of the first immediate child input element; see this answer for further details. (Interestingly, such a predicate always simplifies to true. If tr has a child input element, the hierarchical definition of string-value guarantees that the child's string-value is a substring of the parent's. If tr has no child input element, as in both of your tables, where input is nested inside td and div, the empty node-set converts to the empty string, which every string contains.) Anyway, that's clearly not your intent.
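To make that evaluation concrete, here is a minimal Python sketch that emulates the XPath 1.0 rules by hand using the standard library's xml.etree.ElementTree. As far as I know, ElementTree's XPath subset has no contains() function, so the helpers string_value and contains_dot_input are hypothetical names written for this illustration, not part of any library. The markup is the two tables from the question.

```python
import xml.etree.ElementTree as ET

TABLE_1 = """<table><tbody><tr><td><div>
  <input type="text" name="" value=""/>
</div></td></tr></tbody></table>"""

TABLE_2 = """<table><tbody><tr><td><div></div></td></tr></tbody></table>"""


def string_value(node):
    """XPath string-value of an element: all descendant text, concatenated."""
    return "".join(node.itertext())


def contains_dot_input(tr):
    """Hand-rolled emulation of the predicate contains(., input) on a tr."""
    # "input" is a relative path on the child axis; only the first match counts.
    first_child_input = tr.find("input")
    # An empty node-set converts to the empty string in XPath 1.0.
    needle = "" if first_child_input is None else string_value(first_child_input)
    # contains(haystack, needle) is a plain substring test.
    return needle in string_value(tr)


for label, markup in (("Table #1", TABLE_1), ("Table #2", TABLE_2)):
    tr = ET.fromstring(markup).find("./tbody/tr")
    print(label, contains_dot_input(tr))
```

Both rows report True: neither tr has an input as an immediate child, so the substring being searched for is the empty string in both cases.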

To check for descendant element containment, use .//input instead. It can be used as a predicate on tr, as your first XPath attempted to do, if it's tr elements that you wish to select,

//table//tbody//tr[position()=1 and .//input]

or as a predicate on table (as shown by @Andersson), if it's really table elements containing an input descendant that you wish to select:

//table[.//input]
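As a rough illustration of the difference, the sketch below applies the descendant test to the two tables from the question with Python's standard xml.etree.ElementTree. ElementTree supports only a subset of XPath, so the [.//input] predicate is approximated with find(".//input") is not None. The helper name has_input_descendant, the wrapping root element, and the id attributes are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

# The two tables from the question; root and the id attributes are
# hypothetical additions so the results can be told apart.
DOC = """<root>
  <table id="t1"><tbody><tr><td><div>
    <input type="text" name="" value=""/>
  </div></td></tr></tbody></table>
  <table id="t2"><tbody><tr><td><div></div></td></tr></tbody></table>
</root>"""


def has_input_descendant(node):
    """Approximates the XPath predicate [.//input]."""
    return node.find(".//input") is not None


root = ET.fromstring(DOC)

# Roughly //table//tbody//tr[position()=1 and .//input] for this document:
# take the first tr child of each tbody, then keep those with an input inside.
first_rows = [tbody.find("tr") for tbody in root.iter("tbody")]
rows_with_input = [
    tr for tr in first_rows if tr is not None and has_input_descendant(tr)
]

# Roughly //table[.//input]
tables_with_input = [
    table.get("id") for table in root.iter("table") if has_input_descendant(table)
]

print(len(rows_with_input), tables_with_input)  # prints: 1 ['t1']
```

Only the first table and its row are selected, which is the result the question expected from contains().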

Why XPath contains() should have been named string-contains()

¹In the context of XML, which is so strongly based upon the notion of hierarchy, it is natural to assume that contains refers to hierarchical containment. Of the 24 times the word contains appears in the original XPath specification, 19 times it means hierarchical node containment; only 5 times does it mean substring containment. It's no wonder that confusion over contains() exists. The XPath substring contains() function should have been named string-contains().

Andersson · Answer

You should try

//table[.//input]

to fetch the table node that has an input descendant.
