
Use Nokogiri to get all nodes in an element that contain a specific attribute name

I'd like to use Nokogiri to extract all nodes in an element that have a specific attribute, regardless of its value.

For example, I'd like to find the two nodes that have the attribute "blah" in the document below.

@doc = Nokogiri::HTML::DocumentFragment.parse <<-EOHTML
<body>
  <h1 blah="afadf">Three's Company</h1>
  <div>A love triangle.</div>
  <b blah="adfadf">test test test</b>
</body>
EOHTML

I found this suggestion (below) at this website: http://snippets.dzone.com/posts/show/7994, but it doesn't return the 2 nodes in the example above. It returns an empty array.

# get elements with attribute:
elements = @doc.xpath("//*[@*[blah]]")

Thoughts on how to do this?

Thanks!

user141146 asked Sep 03 '10 20:09



2 Answers

elements = @doc.xpath("//*[@*[blah]]")

This XPath expression can never match anything. It asks for all elements that have an attribute which itself has a child element named 'blah'. Since attributes can't have child elements, the result is always empty.

The DZone snippet is confusing in that when they say

elements = @doc.xpath("//*[@*[attribute_name]]")

the inner square brackets are not literal; they are a placeholder showing where you put the attribute name. The outer square brackets, by contrast, are literal. :-p

They also have an extra * in there, after the @.

What you want is

elements = @doc.xpath("//*[@blah]")

This will give you all the elements that have an attribute named 'blah'.

LarsH answered Sep 29 '22 21:09



You can use CSS selectors:

elements = @doc.css "[blah]"
Daniel O'Hara answered Sep 29 '22 20:09
