Hope someone out there can quickly point me in the right direction with my XPath difficulties.
Currently I've got to the point where I'm identifying the correct table I need in my HTML source, but then I need to process only the rows that have the text 'Chapter' somewhere in the DOM.
My last attempt was to do this:
// get the correct table
HtmlTable table = page.getFirstByXPath("//table[2]");
// now the failing bit....
def rows = table.getByXPath("*/td[contains(text(),'Chapter')]")
I thought the XPath above would mean: get me all elements that have a child 'td' element that contains the text 'Chapter' somewhere in its DOM.
An example of a matching row from my source is :
<tr valign="top">
<td nowrap="" align="Right">
<font face="Verdana">
<a href="index.cfm?a=1">Chapter 1</a>
</font>
</td>
<td class="ChapterT">
<font face="Verdana">DEFINITIONS</font>
</td>
<td> </td>
</tr>
Any help / pointers greatly appreciated.
Thanks,
Use this XPath:
//td[contains(., 'Chapter')]
You want all tds under your current node, not all tds in the document, which is what the currently accepted answer selects.
Use:
.//td[.//text()[contains(., 'Chapter')]]
This selects all td descendants of the current node that have at least one text node descendant whose string value contains the string "Chapter".
If it is known in advance that any td under this table only has a single text node, this can be simplified to just:
.//td[contains(., 'Chapter')]
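To see how the expressions in this thread differ, here is a minimal sketch that runs each of them against a small stand-in document. It uses the JDK's built-in javax.xml.xpath rather than HtmlUnit so that it is self-contained. The class name ChapterCellDemo, the helper count, and the sample markup (including the extra first table and the tbody wrapper, which HTML parsers commonly insert) are assumptions made for illustration, not part of the original page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChapterCellDemo {
    // Hypothetical, well-formed stand-in for the page: the second table
    // holds the row from the question, wrapped in an assumed <tbody>.
    static final String XML =
        "<html><body>"
        + "<table><tr><td>Chapter list</td></tr></table>"
        + "<table><tbody>"
        + "<tr valign=\"top\">"
        + "<td nowrap=\"\" align=\"Right\">\n<font face=\"Verdana\">\n"
        + "<a href=\"index.cfm?a=1\">Chapter 1</a>\n</font>\n</td>"
        + "<td class=\"ChapterT\"><font face=\"Verdana\">DEFINITIONS</font></td>"
        + "<td> </td>"
        + "</tr>"
        + "</tbody></table>"
        + "</body></html>";

    // Evaluates expr with the second table as the context node and
    // returns how many nodes it selects.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node table = (Node) xpath.evaluate("//table[2]", doc, XPathConstants.NODE);
        NodeList hits = (NodeList) xpath.evaluate(expr, table, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        String[] expressions = {
            // Question's attempt: only looks at td grandchildren of the table,
            // and text() only considers the td's first direct text node.
            "*/td[contains(text(),'Chapter')]",
            // Right scope, but text() still misses text nested in <font>/<a>.
            ".//td[contains(text(),'Chapter')]",
            // Leading // starts from the document root, so the other table matches too.
            "//td[contains(., 'Chapter')]",
            // Scoped to the context table, tests every descendant text node.
            ".//td[.//text()[contains(., 'Chapter')]]",
            // Scoped to the context table, tests the td's whole string value.
            ".//td[contains(., 'Chapter')]"
        };
        for (String expr : expressions) {
            System.out.println(count(expr) + "  " + expr);
        }
    }
}
```

With this sample the first two expressions select nothing, the document-wide one also picks up the cell in the first table, and the two relative expressions select only the 'Chapter 1' cell. Note that contains is case-sensitive and looks at text content only, so the class="ChapterT" attribute does not make the DEFINITIONS cell match.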