I'm using the RubyTidy Ruby bindings for HTML Tidy to make sure the HTML I receive is well-formed. This library is currently the only thing holding me back from getting a Rails application onto Ruby 1.9. Are there any alternative libraries that will tidy up chunks of HTML on Ruby 1.9?
tidy_ffi (http://github.com/libc/tidy_ffi/blob/master/README.rdoc) works with Ruby 1.9 (latest version).
If you are working on Windows, you need to set the library_path, e.g.:
require 'tidy_ffi'
TidyFFI.library_path = 'lib\\tidy\\bin\\tidy.dll'
tidy = TidyFFI::Tidy.new('test')
puts tidy.clean
(It uses the same DLL as tidy.) The link above gives more usage examples.
I am using Nokogiri to fix invalid HTML:
Nokogiri::HTML::DocumentFragment.parse(html).to_html