
How good is the Rails sanitize() method?

Can I use ActionView::Helpers::SanitizeHelper#sanitize on user-entered text that I plan on showing to other users? E.g., will it properly handle all cases described on this site?

Also, the documentation mentions:

Please note that sanitizing user-provided text does not guarantee that the resulting markup is valid (conforming to a document type) or even well-formed. The output may still contain e.g. unescaped '<', '>', '&' characters and confuse browsers.

What's the best way to handle this? Pass the sanitized text through Hpricot before displaying?

Tom Lehman asked Jun 06 '10 19:06


2 Answers

Ryan Grove's Sanitize gem goes a lot further than the Rails 3 sanitize helper. It ensures the output HTML is well-formed and has three built-in whitelists:

Sanitize::Config::RESTRICTED: Allows only very simple inline formatting markup. No links, images, or block elements.

Sanitize::Config::BASIC: Allows a variety of markup including formatting tags, links, and lists. Images and tables are not allowed, links are limited to FTP, HTTP, HTTPS, and mailto protocols, and a rel="nofollow" attribute is added to all links to mitigate SEO spam.

Sanitize::Config::RELAXED: Allows an even wider variety of markup than BASIC, including images and tables. Links are still limited to FTP, HTTP, HTTPS, and mailto protocols, while images are limited to HTTP and HTTPS. In this mode, rel="nofollow" is not added to links.
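As a rough sketch of how the three configs differ, the snippet below runs one input through each of them. It assumes the third-party sanitize gem is installed and that the release in use exposes Sanitize.fragment (older releases used Sanitize.clean instead). The sample markup and the helper name sanitize_with_builtin_configs are made up for illustration.

```ruby
# Assumption: the third-party `sanitize` gem is installed (gem install sanitize).
# The rescue only keeps this sketch from crashing when the gem is missing.
begin
  require "sanitize"
rescue LoadError
  warn "sanitize gem not installed; the demo below returns an empty hash"
end

# Hypothetical input: bold text, a link, and an image.
SAMPLE_HTML = '<b><a href="http://foo.com/">foo</a></b><img src="bar.jpg">'

# Run the same fragment through each built-in config and collect the results.
def sanitize_with_builtin_configs(html)
  return {} unless defined?(Sanitize)

  {
    restricted: Sanitize.fragment(html, Sanitize::Config::RESTRICTED),
    basic:      Sanitize.fragment(html, Sanitize::Config::BASIC),
    relaxed:    Sanitize.fragment(html, Sanitize::Config::RELAXED)
  }
end

sanitize_with_builtin_configs(SAMPLE_HTML).each do |name, output|
  puts "#{name}: #{output}"
end
```

Going by the descriptions above, RESTRICTED should drop the link and the image, BASIC should keep the link with rel="nofollow" added, and RELAXED should keep both the link and the image.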

Turadg answered Nov 02 '22 09:11


sanitize is certainly better than the "h" helper. Instead of escaping everything, it lets through the HTML tags that you specify. And yes, it does prevent cross-site scripting, because it strips script tags and JavaScript-bearing attributes instead of passing them along.

In short, both will get the job done. Use "h" when you don't expect anything other than plain text, and use sanitize when you want to allow some HTML, or you believe people may try to enter it. Even if you disallow all tags with sanitize, it'll "pretty up" the output by removing them instead of escaping them as "h" does.
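The escaping half of that comparison can be tried without Rails, since "h" is an alias for html_escape, which Ruby's standard library also provides as ERB::Util.html_escape. A minimal sketch follows; the sample input and the helper name escape_for_display are invented for illustration.

```ruby
require "erb"

# Hypothetical user input mixing harmless markup with a script tag.
USER_INPUT = %q(<b>Hi</b> <script>alert("x")</script>)

# Escape every HTML metacharacter, as the "h" helper does.
# Nothing is removed; the tags are turned into visible text.
def escape_for_display(text)
  ERB::Util.html_escape(text)
end

puts escape_for_display(USER_INPUT)
# prints: &lt;b&gt;Hi&lt;/b&gt; &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

The browser renders that output as the literal text the user typed, tags included. sanitize would instead remove the disallowed tags and leave the permitted ones as real markup.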

As for incomplete tags: you could run a validation on the model that passes HTML-containing fields through Hpricot, but I think this is overkill in most applications.

Jaime Bellmyer answered Nov 02 '22 08:11