XPath Subset Selection

Tags: html, xml, xpath

I have the following XML (which is actually HTML):

<html>
    <h4>something</h4>
    <p>a</p>
    <p>b</p>
    <h4>otherthing</h4>
    <p>c</p>
</html>

Can an XPath expression select the "p" nodes that are following siblings of the first "h4" node but not following siblings of the second "h4" node (that is, only "p" nodes a and b)?

mrkschan asked Feb 27 '23 09:02


2 Answers

My take

//p[preceding-sibling::h4[1] and not(preceding-sibling::h4[position() > 1])]

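This selects every p element that has at least one preceding h4 sibling but no second one, that is, the p elements that follow the first h4 and come before any later h4. Note that positions on the preceding-sibling axis count backwards from the context node, so h4[position() > 1] means "any h4 beyond the nearest preceding one".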

Alternative

//h4[1]/following-sibling::p[count(preceding-sibling::h4) = 1]

This selects every following-sibling p of the first h4 that has exactly one preceding h4 sibling.
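To make the "exactly one preceding h4" rule concrete, here is a minimal Python sketch that emulates it by hand on the sample document from the question. It uses only the standard library's xml.etree.ElementTree, whose limited XPath support does not cover the preceding-sibling axis or count(), so the predicate is reproduced with a simple counter. The helper name p_after_first_h4_only is made up for this illustration, and unlike //p the sketch only looks at the direct children of the root element.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
DOC = """<html>
    <h4>something</h4>
    <p>a</p>
    <p>b</p>
    <h4>otherthing</h4>
    <p>c</p>
</html>"""


def p_after_first_h4_only(root):
    """Return the <p> children of root that have exactly one preceding <h4> sibling.

    Hand-written emulation of: p[count(preceding-sibling::h4) = 1]
    """
    selected = []
    h4_seen = 0  # plays the role of count(preceding-sibling::h4)
    for child in root:
        if child.tag == "h4":
            h4_seen += 1
        elif child.tag == "p" and h4_seen == 1:
            selected.append(child)
    return selected


root = ET.fromstring(DOC)
result = [p.text for p in p_after_first_h4_only(root)]
print(result)  # ['a', 'b']
```

With a full XPath 1.0 engine the expressions above can be evaluated directly; the sketch is only meant to show what the count-based predicate tests for each p.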

Gordon answered Mar 12 '23 12:03


Use:

/*/h4[1]/following-sibling::p
            [not(count(preceding-sibling::* | /*/h4[2])
                =
                 count(preceding-sibling::*)
                 )
             ]

More generally, the intersection of two node-sets $ns1 and $ns2 is selected by:

$ns1[count(.|$ns2) = count($ns2)]

The fact that a node $n is not in a node-set $ns1 is expressed by:

not(count($n | $ns1) = count($ns1))

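This is basic set theory, expressed with the standard XPath | (union) operator and the not() function: adding a node to a set leaves the set's size unchanged exactly when the node is already a member.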
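The counting trick can be checked outside XPath as well. The following is a rough Python sketch, again using only the standard library, that mirrors the two identities with ordinary sets of node identities and then applies the non-membership test to the sample document from the question. The helper names count, is_member and intersection are invented for this illustration, and the sibling axes are emulated with list slices rather than evaluated by an XPath engine.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
    <h4>something</h4>
    <p>a</p>
    <p>b</p>
    <h4>otherthing</h4>
    <p>c</p>
</html>"""


def count(*nodesets):
    """Size of the union of the given node lists, comparing nodes by identity."""
    return len({id(n) for ns in nodesets for n in ns})


def is_member(node, nodeset):
    # Mirrors: count($n | $ns1) = count($ns1)
    return count([node], nodeset) == count(nodeset)


def intersection(ns1, ns2):
    # Mirrors: $ns1[count(. | $ns2) = count($ns2)]
    return [n for n in ns1 if is_member(n, ns2)]


root = ET.fromstring(DOC)
children = list(root)
h4s = [c for c in children if c.tag == "h4"]
first_h4, second_h4 = h4s[0], h4s[1]

# Emulates: /*/h4[1]/following-sibling::p[not(second h4 is a preceding sibling)]
start = children.index(first_h4)
selected = []
for i, node in enumerate(children):
    if i > start and node.tag == "p":
        preceding = children[:i]  # preceding-sibling::*
        if not is_member(second_h4, preceding):
            selected.append(node.text)

# Intersection of the first three children with the h4 elements.
common = [n.text for n in intersection(children[:3], h4s)]

print(selected)  # ['a', 'b']
print(common)    # ['something']
```

Nodes are compared by identity rather than by value, which matches how an XPath union treats the same node reached by two different paths.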

Dimitre Novatchev answered Mar 12 '23 12:03