
Displaying user-uploaded HTML content - Security Issue

I have a feature where a user can upload an HTML file. I read its contents in PHP and send them to a third-party API as a string. Before sending it to the API, I want to show the user a preview of the HTML they uploaded so they can press a Confirm button to send it.

The HTML files should mostly be letter templates, but users can modify the HTML and add script tags or inject other malicious code that might harm my website when it is displayed for preview. Is there a way I can avoid this?

I thought about stripping tags, but what if they have onclick events within HTML elements?

GGio asked Aug 19 '16 20:08


1 Answer

I'd start with something like this to strip scripts and comments:

$htmlblacklist = [];
$htmlblacklist[] = '@<script\b[^>]*>.*?</script\s*>@si'; // bye bye javascript
// goodbye comments (match only real <!-- ... --> comments, so <!DOCTYPE ...> is left alone)
$htmlblacklist[] = '@<!--.*?-->@s';

// now apply blacklist
$value = preg_replace($htmlblacklist, '', $value);

For inline events, you should use DOMDocument, as it actually parses the HTML, whereas a regex is shooting in the dark.

In reality, you could use DOMDocument for all of it, and not use Regex at all. Load up the HTML in a DOMDocument object, and iterate through the tree, removing what you want.
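
As a rough sketch of that DOMDocument approach (the function name sanitizeHtmlPreview is made up for illustration, and the rules it applies are my assumptions about what "removing what you want" might cover): it drops script elements and comments, then removes on* attributes and any attribute whose value is a javascript: URL. It is a blacklist, so it will not catch everything (iframe, object, style and similar are untouched); treat it as a starting point rather than a complete sanitizer.

```php
<?php
// Hypothetical helper, not part of the original answer.
function sanitizeHtmlPreview(string $html): string
{
    $doc = new DOMDocument();

    // Uploaded HTML is often invalid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Remove every <script> element and every comment node.
    foreach (iterator_to_array($xpath->query('//script | //comment()')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Walk all attributes and drop the risky ones.
    foreach (iterator_to_array($xpath->query('//@*')) as $attr) {
        // Inline event handlers: onclick, onload, onerror, ...
        $isEventHandler = stripos($attr->nodeName, 'on') === 0;

        // javascript: URLs, ignoring whitespace/control characters that browsers skip.
        $compact = preg_replace('/[\x00-\x20]+/', '', $attr->value);
        $isScriptUrl = stripos($compact, 'javascript:') === 0;

        if ($isEventHandler || $isScriptUrl) {
            $attr->ownerElement->removeAttributeNode($attr);
        }
    }

    // Returns a full document (doctype, <html>, <body>) since loadHTML adds the wrappers.
    return $doc->saveHTML();
}

$value = '<p onclick="alert(1)">Dear customer<script>alert(2)</script></p>';
echo sanitizeHtmlPreview($value);
```

Note that loadHTML may misread non-ASCII text unless the uploaded file declares its charset, so that is worth checking against real templates.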

Kovo answered Nov 17 '22 20:11