I'm confused about what's going on in the Nokogiri docs.
As far as I can tell, if
require 'nokogiri'
some_html = "<html><body><h1>Mr. Belvedere Fan Club</h1></body></html>"
then these three lines do the same thing:
html_doc = Nokogiri::HTML::Document.parse(some_html)
html_doc = Nokogiri::HTML.parse(some_html)
html_doc = Nokogiri::HTML(some_html)
The second is just a convenience method for the first. But to my non-Ruby eyes, the third looks like it's passing an argument to a module, not a method. I realize that Ruby has constructors, but I thought they took the form Class.new, not Module(args). What's going on here?
It's just syntactic sugar. Look at how the Nokogiri and Nokogiri::HTML modules are defined:
module Nokogiri
  class << self
    ###
    # Parse HTML. Convenience method for Nokogiri::HTML::Document.parse
    def HTML thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block
      Nokogiri::HTML::Document.parse(thing, url, encoding, options, &block)
    end
  end

  module HTML
    class << self
      ###
      # Parse HTML. Convenience method for Nokogiri::HTML::Document.parse
      def parse thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block
        Document.parse(thing, url, encoding, options, &block)
      end

      ####
      # Parse a fragment from +string+ in to a NodeSet.
      def fragment string, encoding = nil
        HTML::DocumentFragment.parse string, encoding
      end
    end

    # Instance of Nokogiri::HTML::EntityLookup
    NamedCharacters = EntityLookup.new
  end
end
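First, they define a class method called HTML on the Nokogiri module (yes, Ruby allows method names that start with a capital letter). Then they define the module Nokogiri::HTML, and inside it the class method parse. The method and the module share a name without clashing: because Nokogiri::HTML(some_html) has an argument list, Ruby parses it as a call to the method HTML, while Nokogiri::HTML without parentheses is a constant lookup that returns the module.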
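To see the same pattern outside Nokogiri, here is a minimal sketch. The names Markup and Page are made up for illustration and are not part of Nokogiri; the only point is that a capitalized method and a nested module with the same name can coexist.

```ruby
# Hypothetical names (Markup, Page); not part of Nokogiri.
module Markup
  module Page
    # Class-level method on the nested module, like Nokogiri::HTML.parse.
    def self.parse(text)
      "parsed: #{text}"
    end
  end

  # Capitalized class-level method with the same name as the nested module,
  # like Nokogiri.HTML. Inside it, `Page.parse` has no argument list after
  # `Page`, so `Page` is resolved as the constant (the module).
  def self.Page(text)
    Page.parse(text)
  end
end

p Markup::Page              # no argument list: constant lookup, the module
p Markup::Page("hi")        # argument list: method call
p Markup::Page.parse("hi")  # constant lookup, then a call to parse
```

Ruby itself uses this convention for conversion methods such as Integer("42") and Array(nil), which are capitalized methods defined on Kernel.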
It's not widely known, but the :: operator can also be used to call methods:
"my_string"::size # returns 9
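A few more calls along the same lines, as a quick sketch using only core Ruby. They show that :: and . are interchangeable for method calls, and that capitalized methods are called like any other as long as an argument list follows the name.

```ruby
# `::` and `.` both dispatch a method call on the receiver.
p "my_string"::size      # => 9
p "my_string".size       # => 9

# Module-level methods can be called either way as well.
p Math::sqrt(16)         # => 4.0
p Math.sqrt(16)          # => 4.0

# Capitalized Kernel methods: the argument list marks these as method calls.
p Integer("42")          # => 42
p Kernel::Integer("42")  # => 42
```

In everyday code the dot form is more common for method calls, with :: mostly reserved for constants, which is part of why Nokogiri::HTML(...) looks surprising at first.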