
XPath for deeply nested elements?

I am using Nokogiri.

Suppose I have a deeply nested path:

//h1/h2/h3/h4/h5

I think I can use the following path:

//h1/*/*/*/h5

Is there any way I can avoid using multiple asterisks? Something like //h1/.../h5?

I don't want to keep counting the levels of nesting.

nilanjan asked Nov 01 '12 03:11


3 Answers

If you want to select all h5 that are exactly 4 levels below their h1 ancestor, use:

//h5[ancestor::*[4][self::h1]]

XSLT-based verification:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select="//h5[ancestor::*[4][self::h1]]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the following XML document:

<t>
 <head/>
 <body>
  <h1>First Main title
    <a>
     <b>
       <c>
         <h5 id="id1"/>
         <d>
           <h5 id="id2"/>
         </d>
       </c>
     </b>
    </a>
  </h1>
 </body>
</t>

the XPath expression is evaluated and the result of the evaluation (the selected h5 elements, in this case just one) is copied to the output:

<h5 id="id1"/>

If you don't want to count the number of intermediate levels, but are sure that they don't exceed a certain number (say 7), you can write:

//h5[ancestor::*[not(position() > 7)][self::h1]]

This selects any h5 descendant of any h1, where the "distance" in levels between the h1 and the descendant h5 doesn't exceed 7.

Do note:

An expression like the one below, as suggested in other answers:

//h1//h5

incorrectly selects for the above document:

<h5 id="id1"/>
<h5 id="id2"/>

The second of the two selected h5 elements (id2) is five levels below its h1 ancestor, one level deeper than the wanted one.

Dimitre Novatchev answered Oct 03 '22 05:10


For all h5 elements that descend from an h1, use:

//h1//h5

Or you might prefer the simpler CSS-style selector:

h1 h5
pguardiario answered Oct 03 '22 05:10


Just use //, i.e. //h5. This XPath will select all h5 elements. See the spec: http://www.w3.org/TR/xpath/#path-abbrev

Kirill Polishchuk answered Oct 03 '22 05:10