
PHP htmlentities() on input vs on output [duplicate]

Possible Duplicate:
PHP htmlentities() on input before DB insert, instead of on output

For a PHP application that's simply trying to protect itself against the likes of XSS, at what stage should the htmlentities() function be called? Should it be called on the initial user input, or on every page render where that data is output?

If I use htmlentities() on user input, I end up storing slightly more data in the database. However, in the long run, I save on CPU cycles because I only have to perform the conversion on input, and never again on subsequent output of that data.

I should note that I can't see any foreseeable case of ever having to store HTML input data in my application, so using htmlentities() is purely for XSS protection. In the unlikely event that I do ever need the raw HTML, I can simply call html_entity_decode() to reverse htmlentities(). Additionally, it saves me from forgetting to call htmlentities() on page render and accidentally inserting an XSS exploit into my application.

I've toyed with the idea of using Facebook's XHP extension, but the XML parsing induces quite a lot of overhead, more than what I'm comfortable with for my application.


Summary: Should I use htmlentities() on input or on output? What is the general, accepted approach to this situation?

Richard Keller asked Jun 30 '12 17:06




2 Answers

Unless you can guarantee that, for the lifetime of your application, the input is only ever going to be fed to a web browser, the matter is not up for discussion: you should apply XSS protection on output. Otherwise you will end up having to massage your data on output (whatever kind of output that may be) on a case-by-case basis, which is exactly what your current argument for applying the protection on input is trying to avoid.

Seeing as it's quite unlikely that the above is true even right now (let alone at some unspecified future time), IMHO the answer is obvious.
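
To make the case-by-case problem concrete, here is a minimal sketch of what can happen when the entity-encoded form is what gets stored. The sample string and variable names are made up for illustration, and the JSON payload simply stands in for any non-browser consumer.

```php
<?php
// Hypothetical user input; the names below are illustrative only.
$raw = 'Tom & Jerry <3';

// Approach A: escape on input and store the escaped form.
$storedEscaped = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Approach B: store the raw value and escape only when rendering HTML.
$storedRaw = $raw;

// A non-HTML consumer (here a JSON payload) receives HTML entities with A.
$jsonA = json_encode(['name' => $storedEscaped]);
$jsonB = json_encode(['name' => $storedRaw]);

// Escaping again at render time double-encodes the value stored by A.
$doubleEncoded = htmlentities($storedEscaped, ENT_QUOTES, 'UTF-8');

// With B, a single escape at render time gives the intended HTML.
$htmlB = htmlentities($storedRaw, ENT_QUOTES, 'UTF-8');

echo $jsonA, "\n";         // entities leak into the JSON payload
echo $jsonB, "\n";         // raw text, as a JSON client expects
echo $doubleEncoded, "\n"; // ampersands encoded a second time
echo $htmlB, "\n";         // encoded exactly once
```

With approach A, every non-HTML output path would have to call html_entity_decode() first, and every HTML path has to remember not to escape again, which is the case-by-case massaging described above.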

Jon answered Oct 12 '22 23:10


I prefer using it on output; that keeps open the possibility of using the same data for non-HTML versions of the output.
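
A rough sketch of that idea, assuming the value is stored raw: each output format applies its own encoding when it renders. The functions renderHtml() and renderPlainText() are hypothetical helpers, not part of PHP or any library.

```php
<?php
// Hypothetical helper: HTML output escapes the raw value at render time.
function renderHtml(string $name): string
{
    return '<p>Hello, ' . htmlentities($name, ENT_QUOTES, 'UTF-8') . '</p>';
}

// Hypothetical helper: plain-text output (e.g. an email body) needs no
// HTML escaping, so the raw value is used as-is.
function renderPlainText(string $name): string
{
    return "Hello, {$name}\n";
}

// Raw value, as it would be stored in the database.
$name = 'Tom & Jerry <3';

echo renderHtml($name), "\n";
echo renderPlainText($name);
```

Because the stored value is never HTML-specific, adding another output format later only means adding another renderer.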

Puggan Se answered Oct 13 '22 00:10