Trying to extract attribute values using Nokogiri with custom pseudoclass CSS selectors

Question

Having loaded an (X)HTML page, I'm trying to get the value of a meta tag's "content" attribute. For example, given:

<meta name="author" content="John Smith" />

I'd like to extract the value "John Smith".

I know how to do that using XPath, and I understand that CSS was meant primarily for element selection. However, Nokogiri supports defining custom CSS pseudoclasses, which I thought could be used as follows:

require 'nokogiri'
require 'open-uri'

class CSSext
  # Custom pseudoclass handler: receives the matched nodes and the selector argument.
  def attr(nodeset, tag)
    nodeset.first.attribute_nodes.find_all {|node| node.name == tag}
  end
end

doc = Nokogiri::HTML(URI.open(someurl))
doc.css("meta[name='author']:attr('content')", CSSext.new)

However, this returns the same result as

doc.css("meta[name='author']")

What gives? Nokogiri uses the same engine underneath for both CSS and XPath searches so anything that's possible in XPath should be doable in CSS. How should I go about extracting the attribute value?

akuhn · Accepted Answer

Why not just do this?

doc.at("meta[name='author']")['content']

As far as I understand, pseudoclasses can only be used to filter the nodeset. They cannot replace the nodeset with some other value, such as the value of one of a node's attributes.

Tags: html, css, css-selectors, ruby, nokogiri

user1955506
