
XPath for immediate preceding sibling

XML

<root>
  <p>nodea text 1</p>
  <p>nodea text 2</p>
  <nodea>
  </nodea>
  <p>nodeb text 1</p>
  <p>nodeb text 2</p>
  <nodeb>
  </nodeb>
</root>

I want to get the first preceding sibling p tag of nodea or nodeb, if there is one. For example, for the XML above, the preceding siblings of each node are:

nodea preceding siblings

<p>nodea text 1</p>
<p>nodea text 2</p>

nodeb preceding siblings

<p>nodeb text 1</p>
<p>nodeb text 2</p>

I have tried the XPath below, but it gives me the preceding p tag of nodea instead of nodeb.

nodeb = xml.find('nodeb')
nodeb.xpath('preceding-sibling::p[not(preceding-sibling::nodea)][1]')

If there is no p tag directly before the node, it should return an empty list. For example, in the XML below there are no p tags immediately preceding nodeb.

<root>
  <p>nodea text 1</p> 
  <nodea>
  </nodea>
  <nodeb>
  </nodeb>
</root>

It would be nice if someone could also explain why my XPath is not working and what I should keep in mind when writing XPath.

MA1 asked Jan 05 '23 10:01


1 Answer

You can select preceding-sibling::*[1][self::p] to select the preceding sibling element if it is a p element.
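A minimal sketch of that expression, assuming lxml (which the .xpath() call in the question suggests). The helper name immediate_p and the two constants are hypothetical, introduced only for illustration; the XML is copied from the question.

```python
from lxml import etree

# First sample from the question: each node has p elements right before it.
WITH_P = """<root>
  <p>nodea text 1</p>
  <p>nodea text 2</p>
  <nodea>
  </nodea>
  <p>nodeb text 1</p>
  <p>nodeb text 2</p>
  <nodeb>
  </nodeb>
</root>"""

# Second sample from the question: nodeb comes directly after nodea.
WITHOUT_P = """<root>
  <p>nodea text 1</p>
  <nodea>
  </nodea>
  <nodeb>
  </nodeb>
</root>"""


def immediate_p(xml_text, name):
    """Return the text of the element directly before `name`, if it is a p."""
    root = etree.fromstring(xml_text)
    node = root.find(name)
    # *[1] takes the nearest preceding sibling element (the axis runs
    # backwards), and [self::p] keeps it only if that element is a p.
    hits = node.xpath('preceding-sibling::*[1][self::p]')
    return [p.text for p in hits]


print(immediate_p(WITH_P, 'nodea'))     # ['nodea text 2']
print(immediate_p(WITH_P, 'nodeb'))     # ['nodeb text 2']
print(immediate_p(WITHOUT_P, 'nodeb'))  # []
```

Because * matches only elements, the whitespace text nodes between the tags do not get in the way, and the empty list in the last case matches what the question asks for.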

As for your attempt: once you have selected the nodeb element, I think you want preceding-sibling::p[preceding-sibling::nodea][1], because you want to look at the sibling p elements that sit between the nodea and the nodeb element. Your expression preceding-sibling::p[not(preceding-sibling::nodea)][1] instead keeps only the p siblings that have no preceding nodea sibling, and those are the first two p elements in document order. The [1] then picks the one of those nearest to nodeb, which is why you get a p that belongs to nodea.
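To see the difference between the two predicates, here is a small comparison run against the first XML sample from the question. It again assumes lxml, and the variable names are only illustrative.

```python
from lxml import etree

SAMPLE = """<root>
  <p>nodea text 1</p>
  <p>nodea text 2</p>
  <nodea>
  </nodea>
  <p>nodeb text 1</p>
  <p>nodeb text 2</p>
  <nodeb>
  </nodeb>
</root>"""

root = etree.fromstring(SAMPLE)
nodeb = root.find('nodeb')

# Expression from the question: keeps p siblings with NO nodea before them,
# so only the two p elements ahead of nodea survive the filter.
original = [p.text for p in nodeb.xpath(
    'preceding-sibling::p[not(preceding-sibling::nodea)][1]')]

# Expression from the answer: keeps p siblings that DO have a nodea before
# them, i.e. the ones between nodea and nodeb.
corrected = [p.text for p in nodeb.xpath(
    'preceding-sibling::p[preceding-sibling::nodea][1]')]

# Dropping [1] returns every p between nodea and nodeb, in document order.
all_between = [p.text for p in nodeb.xpath(
    'preceding-sibling::p[preceding-sibling::nodea]')]

print(original)     # ['nodea text 2']
print(corrected)    # ['nodeb text 2']
print(all_between)  # ['nodeb text 1', 'nodeb text 2']
```

Note that the corrected expression relies on a nodea appearing before nodeb. Applied to nodea itself it would return nothing, since no nodea precedes those p elements.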

Martin Honnen answered Jan 13 '23 09:01
