
Strip html from string Ruby on Rails



If you want to do this in a model, where view helpers are not available, call the sanitizer directly:

ActionView::Base.full_sanitizer.sanitize(html_string)

This is the call that the "strip_tags" helper makes under the hood.


There's a strip_tags method in ActionView::Helpers::SanitizeHelper:

http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-strip_tags

Edit: to get the text inside the value attribute, you could use something like Nokogiri with an XPath expression to pull it out of the string.


Yes, call this: sanitize(html_string, tags:[])


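ActionView::Base.full_sanitizer.sanitize(html_string)

Note that full_sanitizer removes every tag and ignores the :tags and :attributes options. To keep an allowlist of tags and attributes, use the safe list sanitizer instead (it is called white_list_sanitizer on older Rails versions):

ActionView::Base.safe_list_sanitizer.sanitize(html_string, :tags => %w(img br p), :attributes => %w(src style))

The statement above allows the tags img, br and p and the attributes src and style.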


I've used the Loofah library, which handles both HTML and XML (whole documents as well as string fragments). It is the engine behind the rails-html-sanitizer gem. I'm simply pasting the code example from the Loofah gem to show how simple it is to use:

unsafe_html = "ohai! <div>div is safe</div> <script>but script is not</script>"

doc = Loofah.fragment(unsafe_html).scrub!(:strip)
doc.to_s    # => "ohai! <div>div is safe</div> "
doc.text    # => "ohai! div is safe "

How about this?

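# One-off cleanup: sanitize every String attribute of the listed models.
# Newer rails-html-sanitizer releases name this class Rails::Html::SafeListSanitizer.
sanitizer = Rails::Html::WhiteListSanitizer.new
ALLOWED_TAGS = %w(p b h1 h2 h3 h4 h5 h6 li ul ol small i u)
ALLOWED_ATTRIBUTES = %w(id style)

[Your, Models, Here].each do |klass|
  # find_each loads records in batches instead of all at once.
  klass.find_each do |record|
    klass.attribute_names.each do |name|
      value = record.public_send(name)
      next unless value.is_a?(String)

      cleaned = sanitizer.sanitize(value, tags: ALLOWED_TAGS, attributes: ALLOWED_ATTRIBUTES)
      # Drop empty paragraphs (followed by CRLF) left behind after sanitizing.
      record.public_send("#{name}=", cleaned.gsub(/<p>\s*<\/p>\r\n/im, ''))
    end
    record.save if record.changed?
  end
end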
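
The gsub call in the script above only removes an empty paragraph when it is immediately followed by a Windows line ending (\r\n). The plain-Ruby sketch below shows that behaviour on two made-up strings. LENIENT_EMPTY_PARAGRAPH is a hypothetical variant that is not part of the original answer and also accepts a bare \n or no line ending at all.

```ruby
# Pattern copied from the answer: empty <p> followed by CRLF.
EMPTY_PARAGRAPH = /<p>\s*<\/p>\r\n/im

# Hypothetical, more lenient variant: the trailing line ending is optional.
LENIENT_EMPTY_PARAGRAPH = /<p>\s*<\/p>(?:\r?\n)?/i

def drop_empty_paragraphs(html, pattern = EMPTY_PARAGRAPH)
  html.gsub(pattern, "")
end

crlf_html = "<p>Hello</p>\r\n<p> </p>\r\n<p>World</p>"
lf_html   = "<p>Hello</p>\n<p></p>\n<p>World</p>"

# The empty paragraph is removed because it is followed by \r\n.
puts drop_empty_paragraphs(crlf_html).inspect
# Left unchanged: the empty paragraph is followed by \n only.
puts drop_empty_paragraphs(lf_html).inspect
# The lenient variant removes it in both cases.
puts drop_empty_paragraphs(lf_html, LENIENT_EMPTY_PARAGRAPH).inspect
```

Whether the stricter or the more lenient pattern is right depends on how the stored HTML was produced, so it is worth checking a few real records first.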