I have a title, doc.at('head/title').inner_html, that comes out as &amp; when it should be &.
My original document is:
<head><title>Foo & Bar</title></head>
but it comes out as the following:
>> doc = Nokogiri::HTML.parse(file, nil, "UTF-8")
>> doc.at('head/title')
=> #<Nokogiri::XML::Element:0x..fdb851bea name="title" children=#<Nokogiri::XML::Text:0x..fdb850808 "Foo & Bar">>
>> doc.at('head/title').inner_html
=> "Foo &amp; Bar"
I don't want to use Iconv or CGI like:
>> require 'cgi'
>> CGI.unescapeHTML(doc.at('head/title').inner_html)
=> "Foo & Bar"
That works, but it is ugly and inconvenient.
Use content instead of inner_html to get the content as plain text instead of (X)HTML.
irb(main):011:0> doc.at('head/title').content
=> "Foo & Bar"