
Markdown and XSS

OK, so I have been reading about Markdown here on SO and elsewhere, and the steps between user input and the database are usually given as:

  1. convert markdown to html
  2. sanitize html (w/whitelist)
  3. insert into database

but to me it makes more sense to do the following:

  1. sanitize markdown (remove all tags - no exceptions)
  2. convert to html
  3. insert into database

Am I missing something? This seems to me to be very nearly XSS-proof.

psb asked Nov 06 '09 21:11



4 Answers

Please see this link:

http://michelf.com/weblog/2010/markdown-and-xss/

> hello <a name="n"
> href="javascript:alert('xss')">*you*</a>

Becomes

<blockquote>
 <p>hello <a name="n"
 href="javascript:alert('xss')"><em>you</em></a></p>
</blockquote>

∴ you must sanitize after converting to HTML.
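
The same conclusion holds even when the input contains no HTML tags at all. The Python sketch below is only an illustration: `strip_tags` and `toy_markdown_links` are hypothetical helpers, and the latter handles nothing but inline links, standing in for a real Markdown converter. It assumes a converter that emits link targets as written, which many do unless configured otherwise.

```python
import re

def strip_tags(text: str) -> str:
    """Naive 'remove all tags' pass over the raw Markdown (hypothetical helper)."""
    return re.sub(r"<[^>]*>", "", text)

def toy_markdown_links(text: str) -> str:
    """Toy stand-in for a Markdown converter: only handles [label](url) links."""
    return re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)

# The payload contains no tags, so there is nothing for strip_tags to remove.
payload = "[click me](javascript:document.location='//evil.example')"

cleaned = strip_tags(payload)       # unchanged: identical to payload
html = toy_markdown_links(cleaned)  # the conversion step creates the dangerous link

print(cleaned == payload)
print(html)
```

The dangerous `href` only comes into existence during conversion, so a filter that runs before conversion never gets a chance to see it.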

Jordan Reiter answered Jan 02 '23 20:01


There are two issues with what you've proposed:

  1. I don't see a way for your users to be able to format posts. You took advantage of Markdown to provide nice numbered lists, for example. In the proposed no-tags-no-exceptions world, I'm not seeing how the end user would be able to do such a thing.
  2. Considerably more important: when you use Markdown as the "native" formatting language and whitelist the other available tags, you limit not just the input side but the output as well. In other words, if your display engine expects Markdown and only lets whitelisted content out, then even if (God forbid) somebody gets into the database and injects malware-laden code into a bunch of posts, the site and its users are still protected, because you sanitize on display as well.

There are some good resources on the web about output sanitization:

  • Sanitizing user data: Where and how to do it
  • Output sanitization (One of my clients, who shall remain nameless and whose affected system was not developed by me, was hit with this exact worm. We have since secured those systems, of course.)
  • BizTech: Best Practices: Never heard of XSS?
John Rudy answered Jan 02 '23 19:01


Well, certainly removing/escaping all tags would make a markup language more secure. However, the whole point of Markdown is that it allows users to include arbitrary HTML tags as well as its own forms of markup(*). When you are allowing HTML, you have to clean/whitelist the output anyway, so you might as well do it after the Markdown conversion to catch everything.

*: It's a design decision I don't agree with at all, and one that I think has not proven useful at SO, but it is a design decision and not a bug.

Incidentally, step 3 should be ‘output to page’; conversion and sanitizing normally take place at the output stage, with the database containing the raw submitted text.
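
A rough Python sketch of that arrangement, using only the standard library: the raw text goes into the database untouched, and conversion plus whitelist filtering happen when the post is rendered. Everything named here (`toy_markdown`, `WhitelistSanitizer`, `save_post`, `render_post`, the `posts` table, the whitelist itself) is an assumption made for illustration. `toy_markdown` only handles `*emphasis*` and passes raw HTML through, as Markdown does. The sanitizer does not balance unclosed tags, so a maintained sanitizing library is the better choice for real use.

```python
import re
import sqlite3
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "em", "strong", "a", "blockquote", "ul", "ol", "li", "code", "pre"}
ALLOWED_ATTRS = {"a": {"href"}}  # every attribute listed here is treated as a URL
SAFE_URL = re.compile(r"^(https?://|mailto:|/|#)", re.IGNORECASE)

class WhitelistSanitizer(HTMLParser):
    """Re-emits whitelisted tags and attributes; drops other markup, escapes text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # tag is dropped; any text inside it is still escaped below
        kept = []
        for name, value in attrs:
            if name in ALLOWED_ATTRS.get(tag, ()) and value and SAFE_URL.match(value.strip()):
                kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))

def sanitize(html_text: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)

def toy_markdown(text: str) -> str:
    """Toy stand-in for a Markdown converter: *emphasis* only, raw HTML passes through."""
    return "<p>" + re.sub(r"\*([^*]+)\*", r"<em>\1</em>", text) + "</p>"

def save_post(conn: sqlite3.Connection, raw: str) -> int:
    """Store the submitted text exactly as typed."""
    cur = conn.execute("INSERT INTO posts (body) VALUES (?)", (raw,))
    conn.commit()
    return cur.lastrowid

def render_post(conn: sqlite3.Connection, post_id: int) -> str:
    """Convert and sanitize at output time, whatever the database happens to contain."""
    (raw,) = conn.execute("SELECT body FROM posts WHERE id = ?", (post_id,)).fetchone()
    return sanitize(toy_markdown(raw))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")

raw = "hello <a name=\"n\" href=\"javascript:alert('xss')\">*you*</a>"
post_id = save_post(conn, raw)
rendered = render_post(conn, post_id)
print(rendered)  # <p>hello <a><em>you</em></a></p>
```

Because the filter runs on every render, a post altered directly in the database goes through the same whitelist as one submitted through the form, and changing the whitelist later requires no migration of stored posts.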

bobince answered Jan 02 '23 21:01


  1. insert into database
  2. convert markdown to html
  3. sanitize html (w/whitelist)

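In Perl:

use strict;
use warnings;
use Text::Markdown ();
use HTML::StripScripts::Parser ();

# Whitelist-based HTML filter, built once and reused.
my $hss = HTML::StripScripts::Parser->new(
   {
       Context         => 'Document',
       AllowSrc        => 0,    # no src attributes (nothing fetched automatically)
       AllowHref       => 1,    # allow href attributes on links
       AllowRelURL     => 1,    # allow relative URLs
       AllowMailto     => 1,    # allow mailto: links
       EscapeFiltered  => 1,    # escape rejected tags as entities
   },
   strict_comment => 1,
   strict_names   => 1,
);

# Convert the raw Markdown (here taken from the first command-line argument)
# to HTML first, then run the resulting HTML through the filter.
my $markdown = shift;
print $hss->filter_html(Text::Markdown::markdown($markdown));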
Shinichiro Aska answered Jan 02 '23 19:01