I am parsing a webpage that includes a structure like this:
<tr>
<td>Label 1</td>
<td>Label 2</td>
<td>Label 3</td>
<td>Something else</td>
</tr>
<tr>
<td>Item 1</td>
<td>Item 2</td>
<td>Item 3</td>
</tr>
What I need to do is select an item based on its label. My thought is that if the label is in the 3rd tag in its row, I can grab the 3rd tag in the next row to find the item. I can't figure out a way to use the position() function in this way, and maybe XPath 1.0 is unable to handle this type of filtering.
My best attempt so far is:
//td[ancestor::tr[1]/preceding-sibling::tr[1]/td[position()]]
I was hoping the position() function would grab the position of the <td> at the beginning of the XPath, since the rest of the XPath is a filter for that node.
Is what I'm trying to do even possible?
You're on the right track: yes, you can use position() along with count().
To select the text Item 2 given Label 2:
//td[. = 'Label 2']/../following-sibling::tr/td[position() = count(//td[. = 'Label 2']/preceding-sibling::td)+1]/text()
Explanation: Select the nth cell, where n is one more than the number of sibling cells that precede the cell with the desired label in the previous row. In effect, use the count() function to determine the label's position in its row, then select the corresponding cell in the next row down by matching against its position().
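If the XPath has to be cross-checked from a script, the same positional lookup can be sketched procedurally in Python. This is only an illustration under assumptions: the rows are wrapped in a hypothetical <table> root so the fragment parses as one document, the closing tags are written as </tr>, and item_for_label is a made-up helper name. It uses the standard library's xml.etree.ElementTree, whose XPath subset does not include count(), so the index is computed in Python instead of running the expression above.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a <table> root (an assumption)
# so that it parses as a single well-formed document.
HTML = """
<table>
<tr>
<td>Label 1</td>
<td>Label 2</td>
<td>Label 3</td>
<td>Something else</td>
</tr>
<tr>
<td>Item 1</td>
<td>Item 2</td>
<td>Item 3</td>
</tr>
</table>
"""


def item_for_label(root, label):
    """Return the text of the cell directly below the cell whose text is `label`.

    Returns None if the label is not found or the next row has no cell
    at that position.
    """
    rows = root.findall(".//tr")
    # Pair every row with the row that immediately follows it.
    for label_row, item_row in zip(rows, rows[1:]):
        label_cells = label_row.findall("td")
        item_cells = item_row.findall("td")
        for index, cell in enumerate(label_cells):
            # `index` plays the role of count(preceding-sibling::td):
            # the number of cells before the label cell in its row.
            if cell.text == label and index < len(item_cells):
                return item_cells[index].text
    return None


root = ET.fromstring(HTML)
print(item_for_label(root, "Label 2"))  # prints: Item 2
```

Unlike the XPath above, this sketch only looks at the row immediately after the label row. The closest XPath equivalent would be to write following-sibling::tr[1] instead of following-sibling::tr, which matters once the table has more than two rows.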