
Do I need to use HTML entities when storing data in the database?

I need to store special characters and symbols in a MySQL database. I can either store a character as it is, like 'ü', or convert it to its HTML entity, such as '&uuml;'.

I am not sure which would be better.

I also have symbols like '♥' and '„'.

Please suggest which one is better, and whether there is any alternative method.

Thanks.

Vivek Vaghela asked Feb 15 '12 18:02




2 Answers

HTML entities were introduced years ago to transport character information over the wire when the transport was not binary-safe, and for cases where the user agent (browser) did not support the charset encoding of the transport layer or server.

Because an HTML entity contains only very basic characters (&, ;, a-z and 0-9), and those characters have the same binary encoding in most character sets, entities were, and still are, safe from those side effects.
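A quick way to see this, as a minimal Python sketch (the three encodings are only an illustrative pick, not an exhaustive list): the entity encodes to the same bytes in each ASCII-compatible charset, while the raw character does not.

```python
# The entity uses only ASCII characters, so its bytes are identical
# in every ASCII-compatible encoding. The raw character's bytes are not.
entity = "&uuml;"
char = "ü"
encodings = ["latin-1", "cp1252", "utf-8"]  # illustrative choice

entity_bytes = {enc: entity.encode(enc) for enc in encodings}
char_bytes = {enc: char.encode(enc) for enc in encodings}

for enc in encodings:
    print(f"{enc:8} entity={entity_bytes[enc]!r} char={char_bytes[enc]!r}")

# True if every encoding produced the same byte sequence.
same_entity_everywhere = len(set(entity_bytes.values())) == 1
same_char_everywhere = len(set(char_bytes.values())) == 1
```

That difference only matters when the receiver may guess the wrong charset, which is the transport problem described above, not a storage problem.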

However, when you store something in the database, you don't have these issues, because you're normally in control and you know what text you store and how.

For example, if you allow Unicode for text inside the database, you can store all characters; none of them is actually special. Note that you need to know your database here, because there are some technical details you can run into. For instance, if you don't know the charset encoding of your database connection, you can't tell the database exactly which text you want to store. But generally, you just store the text and retrieve it later. Nothing special to deal with.
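As a rough sketch of "just store the text and retrieve it later", here is a round trip in Python. It uses an in-memory SQLite database as a stand-in, which is my assumption so the example runs without a server; the table and column names are made up. With MySQL you would additionally make sure the table and the connection agree on a Unicode charset (utf8mb4 is the usual choice).

```python
import sqlite3

# SQLite stands in for MySQL here; "notes" and "body" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

original = "Grüße ♥ „quoted“"

# Parameter binding passes the text through unchanged; no entities involved.
conn.execute("INSERT INTO notes (body) VALUES (?)", (original,))

stored = conn.execute("SELECT body FROM notes").fetchone()[0]

# Searching for the raw character works directly on the stored text.
matches = conn.execute(
    "SELECT COUNT(*) FROM notes WHERE body LIKE ?", ("%ü%",)
).fetchone()[0]

conn.close()
print(stored == original, matches)
```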

In fact there are downsides when you use HTML entities instead of the plain character:

  • HTML entities consume more space: &uuml; takes six characters where ü takes one, so it is larger in LATIN-1, UTF-8, UTF-16 and UTF-32 alike.
  • HTML entities need further processing. They need to be created when written and parsed when read. If you need to search for specific text in your database, or do anything else with it, every such action needs additional handling. That's just overhead.
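Both downsides can be checked with a few lines of Python. This is only a sketch; the name "Müller" is an example value of mine, not something from the question.

```python
import html

char, entity = "ü", "&uuml;"

# Byte sizes as (raw character, entity) per encoding.
sizes = {
    enc: (len(char.encode(enc)), len(entity.encode(enc)))
    for enc in ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
}
for enc, (raw_size, entity_size) in sizes.items():
    print(f"{enc:10} raw={raw_size} entity={entity_size}")

# Searching: a value stored with entities does not match the plain text
# until it has been decoded again.
stored_as_entity = "M&uuml;ller"
found_directly = "Müller" in stored_as_entity
found_after_decoding = "Müller" in html.unescape(stored_as_entity)
```

In a real database the decoding step would have to happen for every row before comparing, or the search term would have to be entity-encoded in exactly the same way the data was.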

The real fun starts when you mix both concepts: you end up in a place you really don't want to be. So just don't do it, because you ain't gonna need it.
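To illustrate what "mixing" looks like in practice, here is a small Python sketch with a hypothetical column that holds the same name stored both ways. Whichever single rule you apply at output time, one of the rows comes out wrong.

```python
import html

# Hypothetical column contents: the same name, stored raw and as an entity.
rows = ["Müller", "M&uuml;ller"]

# Rule A: escape everything on output. The entity row gets double-encoded
# and a browser would display the literal text "M&uuml;ller".
escaped = [html.escape(r) for r in rows]

# Rule B: decode everything on read. That repairs the names, but it also
# alters raw text in which a user literally typed an entity.
decoded = [html.unescape(r) for r in rows]
user_typed = "Use &lt; for less-than"
altered = html.unescape(user_typed)

print(escaped)
print(decoded)
print(altered)
```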

hakre answered Oct 07 '22 13:10


Leave your data raw in the database. Don't use HTML entities for these until you need them for HTML. You never know when you may want to use your data elsewhere, not on a web page.
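A minimal sketch of that idea in Python, assuming one raw value that feeds both a web page and a JSON response (the value and the two output formats are my own example): the escaping happens only at the point where HTML is produced.

```python
import html
import json

# The raw value, exactly as it would sit in the database.
name = "Tom & Jürgen ♥ <3"

# HTML output: escape at render time. Only &, < and > (and quotes) are
# touched; ü and ♥ stay as they are.
as_html = "<p>" + html.escape(name) + "</p>"

# JSON output: no HTML escaping wanted at all.
as_json = json.dumps({"name": name}, ensure_ascii=False)

print(as_html)
print(as_json)
```

Had the value been stored with entities, the JSON consumer would receive HTML markup it never asked for.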

Brad answered Oct 07 '22 15:10