
What does htmlentities with ENT_QUOTES and UTF-8 do?

I have always used a plain htmlentities($_POST['string']); to clean data against XSS attacks. Recently I have seen people use this:

htmlentities($_POST['string'], ENT_QUOTES, 'UTF-8');

What is the advantage or purpose of using that over just htmlentities()?

Also, I don't know if it is relevant, but I always declare UTF-8 in a meta tag at the top of my pages.

Sameer Zahid asked Jun 01 '13 07:06


1 Answer

ENT_QUOTES is needed if the data is being substituted into an HTML attribute, e.g.

echo '<input type="text" value="' . htmlentities($string, ENT_QUOTES) . '">';

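This ensures that both double and single quotes are encoded, so they won't terminate the value="..." attribute prematurely. Before PHP 8.1 the default flag was ENT_COMPAT, which encodes only double quotes, so an attribute delimited with single quotes was left unprotected unless ENT_QUOTES was passed.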
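To see the difference the flag makes, here is a minimal sketch. The sample value in $string is made up for illustration, and ENT_COMPAT is passed explicitly because the default flags depend on the PHP version:

```php
<?php
// Hypothetical input containing both quote styles.
$string = "it's a \"test\"";

// ENT_COMPAT encodes double quotes only; single quotes pass through untouched.
$compat = htmlentities($string, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES encodes single quotes as well.
$quotes = htmlentities($string, ENT_QUOTES, 'UTF-8');

echo $compat, "\n"; // it's a &quot;test&quot;
echo $quotes, "\n"; // it&#039;s a &quot;test&quot;
```

With the ENT_COMPAT result, an attribute written as value='...' would be closed by the apostrophe in "it's"; the ENT_QUOTES result is safe with either quote style.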

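Passing 'UTF-8' is necessary if your page uses the UTF-8 charset and you are running PHP older than 5.4, where the default was ISO-8859-1. Since PHP 5.4 the default is UTF-8, but passing it explicitly does no harm and keeps the code independent of server configuration. The encoding argument needs to match the actual encoding of the string, or multibyte characters will be converted byte by byte and the user will see strange characters.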
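A small sketch of what a mismatch looks like. The sample string is an assumption for illustration, with the bytes of "é" written out explicitly so that the example does not depend on how the source file itself is saved:

```php
<?php
// "café" as UTF-8; \xC3\xA9 is the two-byte UTF-8 sequence for "é".
$string = "caf\xC3\xA9";

// Encoding argument matches the data: "é" becomes a single entity.
$matching = htmlentities($string, ENT_QUOTES, 'UTF-8');

// Encoding argument does not match: each byte is treated as its own
// ISO-8859-1 character and converted separately.
$mismatched = htmlentities($string, ENT_QUOTES, 'ISO-8859-1');

echo $matching, "\n";   // caf&eacute;
echo $mismatched, "\n"; // caf&Atilde;&copy;
```

The second output is what shows up in the browser as "cafÃ©", the typical symptom of mixing the two encodings.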

Barmar answered Sep 28 '22 11:09
