
How do I textile and sanitize html?

I've run into an awkward situation. I want users to be able to use Textile, but their input shouldn't be able to break the valid HTML surrounding their entry, so I have to escape the HTML somehow.

  • html_escape(textilize("</body>Foo")) would break Textile, because the tags Textile itself generates get escaped too, while

  • textilize(html_escape("</body>Foo")) would work, but it breaks various Textile features such as links (written like "Linkname":http://www.wheretogo.com/), since the quotes are transformed into &quot; and are then no longer detected by Textile.

  • sanitize doesn't do a better job.
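
The quoting problem from the second bullet can be reproduced without Textile at all. This is only a minimal sketch: it assumes html_escape behaves like ERB::Util.html_escape from Ruby's standard library, and the textile_link regex is a hypothetical, heavily simplified stand-in for Textile's real link detection. It only shows that the escaped string no longer contains the literal quotes the link syntax depends on.

```ruby
require "erb"

# Assumption: the html_escape helper behaves like ERB::Util.html_escape.
raw_link = '"Linkname":http://www.wheretogo.com/'
raw_tag  = "</body>Foo"

escaped_link = ERB::Util.html_escape(raw_link)
escaped_tag  = ERB::Util.html_escape(raw_tag)

puts escaped_link  # &quot;Linkname&quot;:http://www.wheretogo.com/
puts escaped_tag   # &lt;/body&gt;Foo

# Hypothetical, simplified pattern for "text":url links (not Textile's own parser).
textile_link = /"[^"]+":\S+/

puts raw_link.match?(textile_link)      # true:  literal quotes are present
puts escaped_link.match?(textile_link)  # false: quotes became &quot;
```

Escaping first does neutralize the stray body tag, but it also destroys the markup characters Textile needs, which is why the escaping has to happen inside the Textile processor rather than before or after it.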

Any suggestions on that one? I would prefer not to use Tidy for this problem. Thanks in advance.

Marcel Jackwerth asked Feb 01 '09 22:02



2 Answers

For those who run into the same problem: If you are using the RedCloth gem you can just define your own method (in one of your helpers).

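# Renders Textile to HTML, escaping any raw HTML in the input.
def safe_textilize( s )
  return unless s                # nil (or false) input renders nothing

  doc = RedCloth.new( s.to_s )
  doc.filter_html = true         # escape HTML not produced by Textile itself
  doc.to_html
end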

Excerpt from the Documentation:

Accessors for setting security restrictions.

This is a nice thing if you're using RedCloth for formatting in public places (e.g. Wikis) where you don't want users to abuse HTML for bad things.

If filter_html is set, HTML which wasn't created by the Textile processor will be escaped. Alternatively, if sanitize_html is set, HTML can pass through the Textile processor but unauthorized tags and attributes will be removed.

Marcel Jackwerth answered Oct 27 '22 05:10



This works for me and guards against every XSS attack I've tried, including onmouse... handlers in pre and code blocks:

<%= RedCloth.new( sanitize( @comment.body ), [:filter_html, :filter_styles, :filter_classes, :filter_ids] ).to_html -%>

The initial sanitize removes a lot of potential XSS exploits including mouseovers.

As far as I can tell, :filter_html escapes most HTML tags apart from code and pre. The other filters are there because I don't want users applying any classes, ids, or styles.

I just tested my comments page with your example

"</body>Foo"

and it completely removed the rogue body tag.

I am using RedCloth version 4.2.3 and Rails version 2.3.5.

Noel Walters answered Oct 27 '22 05:10
