
XSS attack prevention

I'm developing a web app where users can respond to blog entries. This is a security problem because they can send dangerous data that will be rendered to other users (and executed as JavaScript).

They can't format the text they send. No "bold", no colors, no nothing. Just simple text. I came up with this regex to solve my problem:

[^\\w\\s.?!()]

So anything that is not a word character (a-z, A-Z, 0-9, _), whitespace, ".", "?", "!", "(" or ")" will be replaced with an empty string. Then every quotation mark will be replaced with "&quot;".
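
To see what this whitelist does in practice, here is a minimal sketch in Python (an assumption; the question does not say which language the server uses). The name strip_disallowed is hypothetical, and re.ASCII is used so that \w matches only ASCII letters, digits and the underscore.

```python
import re

# Matches every character NOT in the whitelist: word characters,
# whitespace, and the punctuation . ? ! ( )
# re.ASCII is an assumption about the intended behaviour.
DISALLOWED = re.compile(r"[^\w\s.?!()]", re.ASCII)


def strip_disallowed(text: str) -> str:
    """Remove every character outside the whitelist, then encode quotes."""
    cleaned = DISALLOWED.sub("", text)
    # Second step from the question. If it runs after the regex,
    # no quotation marks are left, so this replacement never fires.
    return cleaned.replace('"', "&quot;")


if __name__ == "__main__":
    print(strip_disallowed("<script>alert(1)</script>"))  # scriptalert(1)script
    print(strip_disallowed("Don't use <b> here"))          # Dont use b here
```

The markup is neutralised, but ordinary punctuation such as apostrophes, commas and hyphens is lost along with it.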

I check the data on the front end and I check it on my server.

Is there any way somebody could bypass this "solution"?

I'm wondering how Stack Overflow does this. There is a lot of formatting here, so they must do a good job with it.

Colby77 asked May 06 '10 13:05


2 Answers

If you just want plain text, don't worry about filtering specific HTML tags. You want the equivalent of PHP's htmlspecialchars(). A good way to use it is print htmlspecialchars($var, ENT_QUOTES); This function performs the following encodings:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'

This solves the XSS problem at the lowest level, and you don't need a complex library or regex that you don't understand (and that is probably insecure; after all, complexity is the enemy of security).
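
As a rough illustration of the same idea outside PHP, Python's standard library offers html.escape, which is assumed here to be a close analogue of htmlspecialchars($var, ENT_QUOTES). The helper name render_comment and the surrounding <p> markup are hypothetical; note that html.escape writes the single quote as &#x27; rather than &#039;.

```python
import html


def render_comment(user_text: str) -> str:
    """Escape user text at output time and wrap it in a paragraph."""
    # quote=True also encodes double and single quotes,
    # in addition to &, < and >.
    safe = html.escape(user_text, quote=True)
    return f"<p>{safe}</p>"


if __name__ == "__main__":
    print(render_comment("<script>alert('x')</script>"))
```

The stored comment stays untouched; the encoding is applied only when the text is written into the page, so nothing the user typed is lost.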

Make sure to TEST YOUR XSS FILTER by running a free XSS scanner.

rook answered Oct 09 '22 04:10


I agree with Tomalak, and just wanted to add a few points.

  1. Don't allow HTML tags. The idea is to treat user input as text and HTML-escape its characters before rendering them. Use OWASP's ESAPI project for this purpose. This page explains the various possible encodings that you should be aware of.
  2. If you have to allow HTML tags, use a library to do the filtering for you. DO NOT write your own regexes; they are difficult to get right. Use OWASP's AntiSamy project, which was designed specifically for this use case.

Sripathi Krishnan answered Oct 09 '22 02:10