Simple question that keeps bugging me.
Should I HTML encode user input right away and store the encoded contents in the database, or should I store the raw values and HTML encode when displaying?
Storing encoded data greatly reduces the risk of a developer forgetting to encode the data when it's being displayed. However, storing the encoded data will make datamining somewhat more cumbersome and it will take up a bit more space, even though that's usually a non-issue.
HTML encoding makes sure that text is displayed correctly in the browser and not interpreted by the browser as HTML. For example, if a text string contains a less than sign (<) or greater than sign (>), the browser would interpret these characters as the opening or closing bracket of an HTML tag.
Convert All Letters Turn each textual character to an HTML entity. Decimal Radix Use a numeric character reference with a decimal code point. Don't Encode Newlines Ignore all newline characters during the encoding. Display Named Entities Print the entity reference instead of the code point value, if it exists.
i'd strongly suggest encoding information on the way out. storing raw data in the database is useful if you wish to change the way it's viewed at a certain point. the flow should be something similar to:
sanitize user input -> protect against sql injection -> db -> encode for display
think about a situation where you might want to display the information as an RSS feed instead. having to redo any HTML specific encoding before you re-display seems a bit silly. any development should always follow the "don't trust input" meme, whether that input is from a user or from the database.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With