
Sanitize Markdown in Rails?

Users can edit "articles" in my application. Each article is stored in the DB as Markdown and sent to the client as Markdown -- I convert it to HTML client-side with JavaScript.

I'm doing this so that when the user wants to edit the article he can edit and POST the Markdown right back to the server (since it's already on the page).

My question is how to sanitize the Markdown I send to the client -- can I just use Rails' sanitize helper?

Also, any thoughts on this approach in general? Another strategy I thought of was rendering and sanitizing the HTML on the server, and pulling the Markdown to the client only if the user wants to edit the article.

Tom Lehman asked Sep 06 '09 03:09


2 Answers

I follow a couple of principles:

  • store what the user types
  • sanitize on display
  • only send data that is necessary

That leads me to the alternative architecture you suggest:

  • store markdown in the database
  • on render, markdown/sanitize, and send HTML to browser
  • when (and if) the user chooses "Edit", request the raw markdown from the server via AJAX
  • if I have a "preview" view during editing, I try to render it on the server as well (although you may need to drop this step if it's too slow). During preview, though, sanitizing may be less critical, since the user is only looking at their own input.
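
To make the flow above concrete, here is a minimal sketch in plain Ruby. The Article class, its method names, and the two stub callables are made up for illustration; in a real app the renderer would wrap whatever Markdown library you use and the sanitizer would wrap a real HTML sanitizer. The stubs only label their input so you can see the order of operations.

```ruby
# Sketch of "store what the user types, sanitize on display".
# Article, #markdown, and #to_safe_html are hypothetical names, not a Rails API.
class Article
  # The raw source, returned untouched when the user asks to edit.
  attr_reader :markdown

  # renderer:  callable that turns Markdown into HTML
  # sanitizer: callable that turns untrusted HTML into safe HTML
  def initialize(markdown, renderer:, sanitizer:)
    @markdown  = markdown # stored exactly as typed
    @renderer  = renderer
    @sanitizer = sanitizer
  end

  # What a normal page view receives: rendered first, sanitized last.
  def to_safe_html
    @sanitizer.call(@renderer.call(@markdown))
  end
end

# Stub callables that only wrap their input; they do NOT render or sanitize.
stub_renderer  = ->(md)   { "<rendered>#{md}</rendered>" }
stub_sanitizer = ->(html) { "<sanitized>#{html}</sanitized>" }

article = Article.new("*hi* <script>x</script>",
                      renderer: stub_renderer, sanitizer: stub_sanitizer)
puts article.to_safe_html # sanitizer wrapper is outermost, so it ran last
puts article.markdown     # the source comes back unchanged for editing
```

Passing the renderer and sanitizer in as callables is just one way to keep the pieces swappable; the point is that the stored value is never modified and the sanitizer always runs last.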

This has been my approach and it works out pretty cleanly.

ndp answered Nov 11 '22 10:11


The other answers here are good, but let me make a few suggestions on sanitization. Rails' built-in sanitizer is decent, but it doesn't guarantee well-formedness, which tends to be half the problem. It's also a fairly likely target for exploits, since it's not best-of-breed and its large install base gives attackers plenty of reason to probe it.

I believe the best and most forward-looking sanitization around today is html5lib, because it's written to parse as a browser does and it's a collaboration by a lot of leaders in the field. However, it's a bit on the slow side and not very Ruby-like.

In Ruby I recommend either Loofah, which lifts some of the html5lib sanitization code verbatim but uses Nokogiri and runs much faster, or Sanitize, which has a solid test suite and very good configurability (don't shoot yourself in the foot though).

I just released a plugin called ActsAsSanitiled which is a rewrite of ActsAsTextiled to automagically sanitize the textiled output as well using the Sanitize gem. It's designed to give you the best of both worlds: input is untouched in the DB, yet the field always outputs safe HTML without needing to remember anything in the template. I don't use Markdown myself, but I would consider adding BlueCloth support.
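
For anyone curious what "input untouched in the DB, safe output from the field" can look like, here is a rough standalone sketch of the general pattern. It is not the plugin's code or API: the SafeHtmlField module, the safe_html_field macro, the `_source` suffix, and the Comment class are all invented for this example, and ERB::Util.html_escape is only a stand-in for "render the markup, then sanitize it".

```ruby
require "erb"

# Hypothetical class macro: the plain reader returns processed output,
# while "<field>_source" returns the stored input untouched.
module SafeHtmlField
  def safe_html_field(*fields, with:)
    fields.each do |field|
      define_method(field) { with.call(instance_variable_get("@#{field}")) }
      define_method("#{field}_source") { instance_variable_get("@#{field}") }
    end
  end
end

class Comment
  extend SafeHtmlField

  def initialize(body)
    @body = body # kept exactly as the user typed it
  end

  # Stand-in processor: escapes everything instead of rendering and sanitizing.
  safe_html_field :body, with: ->(text) { ERB::Util.html_escape(text.to_s) }
end

comment = Comment.new("<b>hi</b>")
puts comment.body        # processed output, safe to drop into a template
puts comment.body_source # original input, for the edit form
```

Because the template only ever calls the plain reader, there is nothing to remember at render time, which is the whole appeal of the approach.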

gtd answered Nov 11 '22 11:11