 

In Nokogiri, how do I find all the nodes that come before a certain node in my document?

Using Rails 5 and Ruby 2.4. Once I have located a node with Nokogiri, how do I find all the nodes that occur before it in the document, excluding any node that contains it? That is, let's say my document is

<outer>
    <p>Hello</p>
    <inner>
        <most_inner class="abc">Howdy</most_inner>
        <most_inner class="def">Next</most_inner>
    </inner>
</outer>

and I run a query like

node = doc.search('//*[contains(@class, "def")]').first

How would I locate all the preceding nodes (that don't include the one I just identified)? The nodes I would expect would be

<p>Hello</p>
<most_inner class="abc">Howdy</most_inner>
Dave asked Apr 04 '17 22:04



1 Answer

You just need to iterate over the leaf nodes, which the XPath query returns in document order, until you reach the target node.

# Node to exclude
node = doc.search('//*[contains(@class, "def")]').first

# Find all leaf nodes (elements with no child elements)
leaf_nodes = doc.xpath("//*[not(child::*)]")

# Collect leaves in order, stopping as soon as the target node is reached
preceding_nodes = leaf_nodes.take_while { |leaf| leaf != node }

preceding_nodes # => Contains all preceding leaf nodes
fny answered Sep 18 '22 04:09
