I'm grabbing a div of text from a url and would like to remove everything underneath a paragraph which has a backtotop
class. I'd seen a traverse snippet of code here on stackoverflow which looks promising, but I can't figure out how to get it incorporated so @el only contains everything up to the first p.backtotop
in the div.
my code:
@doc = Nokogiri::HTML(open(url))
@el = @doc.css("div")[0]
end
traverse snippet:
doc = Nokogiri::HTML(code)
stop_node = doc.css("p.backtotop")
doc.traverse do |node|
break if node == stop_node
# else, do whatever, e.g. `puts node.name`
end
Write a function that takes a list sorted in non-decreasing order and deletes any duplicate nodes from the list. The list should only be traversed once. For example if the linked list is 11->11->11->21->43->43->60 then removeDuplicates() should convert the list to 11->21->43->60.
Write a removeDuplicates() function that takes a list and deletes any duplicate nodes from the list. The list is not sorted. For example if the linked list is 12->11->12->21->41->43->21 then removeDuplicates() should convert the list to 12->11->21->41->43.
We need to first check for all occurrences at the head node and change the head node appropriately. Then we need to check for all occurrences inside a loop and delete them one by one.
For example:
<body>
<div id="a">
<h2>My Section</h2>
<p class="backtotop">Back to Top</p>
<p>More Content</p>
<p>Even More Content</p>
</div>
</body>
require 'nokogiri'
doc = Nokogiri::HTML(my_html)
div = doc.at('#a')
div.at('.backtotop').xpath('following-sibling::*').remove
puts div
#=> <div id="a">
#=> <h2>My Section</h2>
#=> <p class="backtotop">Back to Top</p>
#=>
#=>
#=> </div>
Here's a more complicated example, where the backtotop
item may not be at the root of the div:
<body>
<div id="b">
<h2>Another Section</h2>
<section>
<p class="backtotop">Back to Top</p>
<p>More Content</p>
</section>
<p>Even More Content</p>
</div>
</body>
require 'nokogiri'
doc = Nokogiri::HTML(my_html)
div = doc.at('#b')
n = div.at('.backtotop')
until n==div
n.xpath('following-sibling::*').remove
n = n.parent
end
puts div
#=> <div id="b">
#=> <h2>Another Section</h2>
#=> <section><p class="backtotop">Back to Top</p>
#=>
#=> </section>
#=> </div>
If your HTML is more complicated than the above then please provide an actual sample along with the result you want. This is good advice for any future question you ask.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With