
HTML Parser into DOM in Ruby

Is there an HTML parser in Ruby that reads an HTML document into a DOM tree and represents HTML tags as DOM elements?

I know about Nokogiri, but it doesn't parse HTML into a DOM tree.

u19964 asked Apr 08 '26 21:04


1 Answer

Despite your remark, Nokogiri is the way to go:

require 'nokogiri'
doc = Nokogiri::HTML('<body><p>Hello, worlds!</body>')

It parses even invalid HTML (note the unclosed <p> above) and returns a DOM tree:

>> doc.class
=> Nokogiri::HTML::Document
>> doc.root.class
=> Nokogiri::XML::Element
>> doc.root.children.class
=> Nokogiri::XML::NodeSet
>> doc.root.children.first.content
=> "Hello, worlds!"
akuhn answered Apr 11 '26 16:04


