
How in Django/Python can I ensure safety from WYSIWYG-entered HTML?

I would like to eliminate XSS / JavaScript injection vulnerabilities in a web application where users can use an editor such as CKEditor, which allows arbitrary HTML. (Whether or not my particular editor allows arbitrary HTML, blackhats will be able to submit arbitrary HTML anyway.) So no JavaScript should get through, whether via SCRIPT tags, ONCLICK and family, or anything else. The target platform is Python and Django.

What are my best options here? I am open to an implementation that whitelists tags and attributes; that is, I don't see it as necessary to accept everything that can be built in HTML and remove only the JavaScript. I am happy with a set of supported tags that still allows fairly expressive rich text. I would also be open to an editor that produces Markdown, stripping all HTML tags before the data is saved. (HTML manipulation seems simpler, but I would also consider Markdown-based solutions.)

I also don't consider it necessary to produce sanitized text; it would be acceptable to throw an exception saying that a submission has failed testing. (So lowercasing the string and searching for '<script', 'onclick', etc. might be sufficient.)

My first choice, if it is available, would probably be a whitelist of tag and attribute names.

What are the best solutions, if any, that are out there?

Christos Hayward asked Mar 10 '17 18:03




1 Answer

If you choose a WYSIWYG editor that produces HTML, sanitizing that HTML on the server with bleach (via whitelisting) is probably enough.
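To make the whitelisting idea concrete, here is a minimal standard-library sketch of the "reject instead of sanitize" variant mentioned in the question. It is an illustration only, not a vetted sanitizer and not how bleach is implemented: the tag, attribute, and scheme lists and the names `validate_html` and `DisallowedHTML` are all hypothetical, and `html.parser` does not parse exactly like a browser. In a real project, bleach's `bleach.clean(text, tags=..., attributes=...)` is the sturdier route, since it parses the input and re-serializes only what the whitelist allows.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical whitelist: adjust to what your editor is configured to emit.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li",
                "blockquote", "a"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}


class DisallowedHTML(ValueError):
    """Raised when a submission contains markup outside the whitelist."""


def _safe_url(value):
    # Refuse empty values and any whitespace/control characters, which
    # browsers may ignore inside a scheme such as "java\tscript:".
    if not value or any(ch.isspace() or ord(ch) < 0x20 for ch in value):
        return False
    try:
        scheme = urlsplit(value).scheme
    except ValueError:
        return False
    # Only absolute URLs with a whitelisted scheme pass in this sketch.
    return scheme.lower() in ALLOWED_SCHEMES


class _WhitelistChecker(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and decodes
        # character references in attribute values before we see them.
        if tag not in ALLOWED_TAGS:
            raise DisallowedHTML(f"tag not allowed: <{tag}>")
        allowed = ALLOWED_ATTRS.get(tag, set())
        for name, value in attrs:
            if name not in allowed:
                raise DisallowedHTML(f"attribute not allowed: {name}")
            if name == "href" and not _safe_url(value):
                raise DisallowedHTML(f"URL not allowed: {value!r}")

    def handle_endtag(self, tag):
        if tag not in ALLOWED_TAGS:
            raise DisallowedHTML(f"tag not allowed: </{tag}>")

    def _reject(self, data):
        raise DisallowedHTML("comments and declarations are not allowed")

    # Comments, doctypes, processing instructions and CDATA are all refused.
    handle_comment = handle_decl = handle_pi = unknown_decl = _reject


def validate_html(fragment):
    """Return the fragment unchanged, or raise DisallowedHTML."""
    checker = _WhitelistChecker(convert_charrefs=True)
    checker.feed(fragment)
    checker.close()
    return fragment
```

In a Django form, a function like this could be called from a field's `clean_<fieldname>` method, converting `DisallowedHTML` into a `ValidationError`. Note that it stores the submitted text unchanged, so any difference between this parser and a browser's is a potential gap; that is the main reason to prefer a library that re-serializes its output.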

If you choose a Markdown (or another non-HTML markup) editor, you will probably save the Markdown source and, on the server side, generate the HTML and then sanitize it (after generation!). Because the HTML is sanitized after rendering, you can keep the Markdown as is (inline HTML and all). However, if your client-side editor supports preview, you also need to be very careful about in-browser rendering when Markdown is loaded from the server! Most Markdown editors include client-side sanitizers for this purpose.

Udi answered Sep 22 '22 03:09
