Should HTML be encoded before being stored in, say, a database? Or is it normal practice to encode it on its way out to the browser? Should all my text-based field lengths be quadrupled in the database to allow for the extra storage? Looking for best practice rather than a solid yes or no :-)

Should HTML be encoded before being persisted?

4 Answers

Is the data in your database really HTML or is it application data like a name or a comment that you just happen to know will end up as part of an HTML page?

If it's application data, I think it's best to:

represent it in a form that is native to the environment (e.g. unencoded in the database), and
make sure it's properly translated as it crosses representational boundaries (encode when you generate the HTML page).

If you're a fan of MVC, this also helps separate the view/controller from the model (and from the persistent storage format).

Representation

For example, assume someone leaves the comment "I love M&Ms". It's probably easiest to represent it in the code as the plain-text String "I love M&Ms", not as the HTML-encoded String "I love M&amp;Ms". Technically, the data as it exists in the code is not HTML yet, and life is easiest if the data is represented as simply and accurately as possible. This data may later be used in a different view, e.g. a desktop app. It may be stored in a database, a flat file, or an XML file, and perhaps later be shared with another program. It's simplest for the other program to assume the string is in the "native" representation for the format: "I love M&Ms" in a database and flat file, and "I love M&amp;Ms" in the XML file. I would cringe to see the HTML-encoded value encoded again in an XML file ("I love M&amp;amp;Ms").
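The M&Ms example can be checked with a few lines of code. This is a minimal sketch in Python (the thread does not name a language, so the choice of Python and its standard `html` and `xml.sax.saxutils` modules is an assumption); the variable names are illustrative only.

```python
import html
from xml.sax.saxutils import escape as xml_escape

# Native, unencoded form held by the application and the database.
comment = "I love M&Ms"

# Encoded only when the value crosses into a specific representation.
html_text = html.escape(comment)   # for an HTML page
xml_text = xml_escape(comment)     # for an XML text node

# If the HTML-encoded form had been stored instead, a correct XML writer
# would encode it again and produce the value the answer cringes at.
double_encoded = xml_escape(html_text)

print(html_text)       # I love M&amp;Ms
print(xml_text)        # I love M&amp;Ms
print(double_encoded)  # I love M&amp;amp;Ms
```

Keeping `comment` unencoded means each output format applies exactly one layer of encoding.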

Translation

Later, when the data is about to cross a representation boundary (e.g. displayed in HTML, stored in a database, plain-text file, or XML file), it's important to make sure it is properly translated so that it is represented accurately in a format native to that next environment. In short, when you go to display it on an HTML page, make sure it's translated to properly encoded HTML (manually or through a tool) so the value is accurately displayed on the page. When you go to store it in the database or use it in a query, use escaping and/or prepared statements and bound variables to ensure the same conceptual value is accurately represented to the database. When you go to store it in an XML file, ensure it's XML-encoded.
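A rough sketch of those two boundaries, assuming Python with an in-memory SQLite database: the `comments` table and the helper names `save_comment` and `render_comments` are hypothetical, chosen only for illustration. Bound parameters handle the database boundary, and `html.escape` handles the HTML boundary.

```python
import html
import sqlite3

def save_comment(conn, text):
    """Database boundary: pass the raw value as a bound parameter."""
    conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))

def render_comments(conn):
    """HTML boundary: encode each stored value as it is written to the page."""
    rows = conn.execute("SELECT body FROM comments ORDER BY id").fetchall()
    return "\n".join("<p>" + html.escape(body) + "</p>" for (body,) in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

save_comment(conn, "I love M&Ms")
save_comment(conn, "<script>alert('hi')</script>")

# The database holds the original text, byte for byte.
stored = [body for (body,) in conn.execute("SELECT body FROM comments ORDER BY id")]

# The page holds the encoded text, so the script tag is shown, not executed.
page = render_comments(conn)

print(stored)
print(page)
```

In a real application a template engine with auto-escaping would normally play the role of `render_comments`.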

Failure to translate properly when crossing representation boundaries is the source of injection attacks such as SQL injection. Be conscious of that whenever you are working with multiple representations/languages (e.g. Java, SQL, HTML, JavaScript, XML, etc.).
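To make the failure mode concrete, here is a small sketch (again assuming Python and SQLite; the `users` table and the input value are made up) that runs the same hostile input through a concatenated query and through a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Input that is harmless as text but meaningful as SQL.
hostile = "nobody' OR '1'='1"

# No translation at the boundary: the text is pasted into the SQL source,
# so the quote characters change the structure of the query.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Proper translation: the driver passes the value as data, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # 2, every row matched
print(len(safe_rows))    # 0, nobody has that literal name
```

The same pattern applies to HTML: unencoded text pasted into markup can change the structure of the page.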

On the other hand, if you are really trying to save HTML page fragments to the database, then I am unclear what you mean by "encoded before being stored". If it is strictly valid HTML, all the necessary values should already be encoded (e.g. &amp;, &lt;, etc.).

answered Oct 26 '22 16:10

Bert F

The practice is to HTML encode before display.

If you are consistent about encoding before displaying, you have done a good bit of XSS prevention.

You should save the original form in your database. This preserves the original, and you may want to do other processing on it rather than on the encoded version.
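One example of "other processing" that works better on the original is truncating text for a preview. The sketch below assumes Python's `html` module; `preview` is a hypothetical helper, not something from the thread.

```python
import html

def preview(raw, limit):
    """Truncate the original text first, then encode it for HTML."""
    return html.escape(raw[:limit])

raw = "Tom & Jerry"

# Truncating the encoded form can cut an entity in half.
encoded_then_cut = html.escape(raw)[:7]

# Truncating the original and encoding last keeps the markup intact.
cut_then_encoded = preview(raw, 7)

print(encoded_then_cut)  # Tom &am
print(cut_then_encoded)  # Tom &amp; J
```

Searching, sorting, and length validation run into similar trouble when they operate on encoded text.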

answered Oct 26 '22 16:10

Oded

Database-vendor-specific escaping on input, HTML escaping on output.

answered Oct 26 '22 16:10

K. Norbert

I disagree with everyone who thinks it should be encoded at display time. If it's encoded before it reaches the database, an attack is only possible if a developer purposely decodes it before displaying it. However, if you only encode it when presenting it, there is always a chance that the step gets missed by some other newbie developer, like a new hire, or through a bad implementation. If it's sitting there unencoded, it's just waiting to pop out on the internet and spread like herpes. Losing the original data shouldn't be a concern: encode + decode should produce the same data every time. Just my two cents.
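The round-trip claim is easy to check. A minimal sketch, assuming Python's `html.escape` and `html.unescape` (other encoders may behave differently), which also shows how much the encoded form can grow and what happens if already-encoded text is encoded a second time:

```python
import html

original = 'I love M&Ms <3 "quoted"'

encoded = html.escape(original)
round_trip = html.unescape(encoded)

# Encoding text that is already encoded adds a second layer, which a
# single decode (such as the browser's) does not fully remove.
twice = html.escape(encoded)
shown_after_one_decode = html.unescape(twice)

# Storage cost: with this encoder a single quote character becomes six.
worst_case = html.escape('"' * 10)

print(round_trip == original)             # True
print(shown_after_one_decode == encoded)  # True, entities stay visible
print(len(original), len(encoded))
print(len(worst_case))                    # 60
```

So the round trip does hold with this encoder, at the cost of extra storage and the need to guarantee that every write path encodes exactly once.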

answered Oct 26 '22 15:10

user2175766
