
How to select all elements between two nodes with XPath

Tags: c#, c#-4.0, xpath

How do I select everything (all possible nodes) between the first and second h2? There can be n nodes between them, and there can be m h2 tags.

The nodes aren't necessarily going to be contained in an HTML element, so the selector can just grab them all.

<html>
 <h2>asdf</h2>
 <p>good stuff 1</p>
 <p>good stuff 2</p>
 <p>good <a href="#">asdf</a>stuff n...</p>
 <h2>qwer</h2>
 <p>test2</p>
 <h2>dfgh</h2>
 <p>test2</p>
</html>

I'm just getting my feet wet with XPath. Please help with my newbie question :)

Thanks so much!

Hoppe asked Mar 01 '12 22:03


2 Answers

One XPath expression that selects the wanted elements is:

   /*/h2[1]
      /following-sibling::p
        [count(. | /*/h2[2]/preceding-sibling::p)
        =
         count(/*/h2[2]/preceding-sibling::p)
        ]

In general, in such cases one can use the Kayessian formula for set intersection:

$ns1[count(.|$ns2) = count($ns2)]

This XPath expression selects all the nodes that belong both to the nodesets $ns1 and $ns2.

If you want to get all nodes between two given nodes $n1 and $n2, this is the intersection of two nodesets: $n1/following-sibling::node() and $n2/preceding-sibling::node().

Just substitute these expressions into the Kayessian formula and you have the wanted XPath expression.
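As a rough illustration of that substitution, here is a minimal C# sketch that evaluates the resulting XPath 1.0 expression against the sample document from the question using XmlDocument.SelectNodes. The class name BetweenHeadings and the method name SelectBetween are hypothetical, chosen only for this example; the sample markup is inlined as a string so the snippet is self-contained.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class BetweenHeadings
{
    // Sample document from the question, inlined for the example.
    public const string SampleXml =
        "<html>" +
        "<h2>asdf</h2>" +
        "<p>good stuff 1</p>" +
        "<p>good stuff 2</p>" +
        "<p>good <a href=\"#\">asdf</a>stuff n...</p>" +
        "<h2>qwer</h2>" +
        "<p>test2</p>" +
        "<h2>dfgh</h2>" +
        "<p>test2</p>" +
        "</html>";

    // Kayessian formula $ns1[count(. | $ns2) = count($ns2)] with
    //   $ns1 = /*/h2[1]/following-sibling::node()
    //   $ns2 = /*/h2[2]/preceding-sibling::node()
    public const string BetweenXPath =
        "/*/h2[1]/following-sibling::node()" +
        "[count(. | /*/h2[2]/preceding-sibling::node())" +
        " = count(/*/h2[2]/preceding-sibling::node())]";

    // Returns the serialized form of every node between the first and second h2.
    public static List<string> SelectBetween(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(BetweenXPath))
        {
            result.Add(node.OuterXml);
        }
        return result;
    }
}
```

Calling BetweenHeadings.SelectBetween(BetweenHeadings.SampleXml) should return the three p elements that sit between the first two h2 elements. Note that XmlDocument discards whitespace-only text nodes unless PreserveWhitespace is set to true, so with the default settings node() will not pick up the indentation between elements.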

In XPath 2.0, of course, one would use the << or >> operator, something like:

 /*/h2[1]/following-sibling::p[. << /*/h2[2]]
Dimitre Novatchev answered Nov 18 '22 18:11


Not sure about XPath, but since the question is tagged C# 4.0, the following code does the job:

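// Requires: using System.Linq; using System.Xml.Linq;
var between = XElement.Parse(xml)   // xml: string holding the document
    .Element("h2")                  // first <h2> child of the root
    .ElementsAfterSelf()            // sibling elements that follow it
    .TakeWhile(n => n.Name != "h2") // stop at the next <h2>
    .ToList();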
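ElementsAfterSelf only returns elements, so any loose text or comments between the two headings would be skipped. Since the question mentions nodes that are not wrapped in an element, here is a hedged variant that uses NodesAfterSelf instead. The class name SiblingNodes and the method name NodesUntilNextH2 are made up for this sketch, and the cast-based lambda is written to stay within C# 4.0 syntax.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SiblingNodes
{
    // Returns every sibling node (elements, text, comments) that follows the
    // first <h2> child of the root, stopping before the next <h2> element.
    public static List<XNode> NodesUntilNextH2(string xml)
    {
        XElement firstH2 = XElement.Parse(xml).Element("h2");
        if (firstH2 == null)
        {
            return new List<XNode>();   // no <h2> at all: nothing to select
        }

        return firstH2
            .NodesAfterSelf()
            // Keep going while the node is not an element, or is an element
            // other than <h2>.
            .TakeWhile(n => !(n is XElement) || ((XElement)n).Name != "h2")
            .ToList();
    }
}
```

For the sample document in the question this should give the same three p elements as the element-only version; the difference only shows up when text or comment nodes sit directly between the headings.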
the_joric answered Nov 18 '22 20:11