
HTML tidy/cleaning in Ruby 1.9

I'm currently using the RubyTidy Ruby bindings for HTML Tidy to make sure the HTML I receive is well-formed. This library is the only thing holding me back from moving a Rails application to Ruby 1.9. Are there any alternative libraries that will tidy up chunks of HTML on Ruby 1.9?

Christian asked Aug 20 '09 20:08


2 Answers

tidy_ffi (http://github.com/libc/tidy_ffi/blob/master/README.rdoc) works with Ruby 1.9 (latest version).

If you are working on Windows, you need to set the library_path, e.g.:

    require 'tidy_ffi'
    TidyFFI.library_path = 'lib\\tidy\\bin\\tidy.dll'
    tidy = TidyFFI::Tidy.new('test')
    puts tidy.clean

(It uses the same DLL as tidy.) The link above gives more usage examples.

surajz answered Sep 29 '22 12:09


I am using Nokogiri to fix invalid HTML:

    Nokogiri::HTML::DocumentFragment.parse(html).to_html

Laurynas answered Sep 29 '22 11:09