I am using PHP 5.2.6 and my app's character set is UTF-8.
Now, how should I change PHP's default character set? NOT the one that specifies the output's MIME type and character set,
but the one that applies to all PHP functions such as htmlspecialchars, htmlentities, etc.
I know there is a parameter in those functions that takes the character set of the input string. But I don't want to specify it for every function I use. And if I forget it somewhere, it will be a mess.
I also know that I can wrap those functions and create my own wrapper like:
function myHtmlize($str)
{
    return htmlspecialchars($str, ENT_COMPAT, 'UTF-8');
}
I don't like this solution either.
I really want to tell PHP to take 'UTF-8' as the character set by default, not 'iso-8859-1'.
Is it possible?
Difference between the htmlentities() and htmlspecialchars() functions: The only difference between these functions is that htmlspecialchars() converts only the special characters to HTML entities, whereas htmlentities() converts all applicable characters to HTML entities.
Description. The htmlspecialchars() function converts special characters (& (ampersand), " (double quote), ' (single quote), < (less than), > (greater than)) to HTML entities: & becomes &amp;, " becomes &quot;, ' becomes &#039; (when ENT_QUOTES is set), < becomes &lt;, and > becomes &gt;.
The htmlspecialchars() function converts some predefined characters to HTML entities.
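To make the difference between the two functions concrete, here is a minimal sketch. The sample string and variable names are made up for illustration, and the charset is passed explicitly so the result does not depend on any default:

```php
<?php
// Hypothetical sample: "Café & bar", with é written as its two UTF-8 bytes
// so the example does not depend on how this file itself is encoded.
$text = "Caf\xC3\xA9 & bar";

// htmlspecialchars() escapes only the HTML-special characters (&, <, >, quotes).
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() additionally replaces characters that have a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // the é is left as-is, & becomes &amp;
echo $entities, "\n";  // Caf&eacute; &amp; bar
```

With UTF-8 output, htmlspecialchars() is usually enough, since accented characters can be sent as they are.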
Definition and Usage. The utf8_encode() function encodes an ISO-8859-1 string to UTF-8. Unicode is a universal standard, and has been developed to describe all possible characters of all languages plus a lot of symbols with one unique number for each character/symbol.
Something like this? http://us2.php.net/manual/en/function.setlocale.php
* LC_ALL for all of the below
* LC_COLLATE for string comparison, see strcoll()
* LC_CTYPE for character classification and conversion, for example strtoupper()
* LC_MONETARY for localeconv()
* LC_NUMERIC for decimal separator (See also localeconv())
* LC_TIME for date and time formatting with strftime()
* LC_MESSAGES for system responses (available if PHP was compiled with libintl)
There is a C-function determine_charset(char *charset_hint ...) which is used to find the "right" charset based on
in that order and depending on whether some extensions are built-in or not.
The "problem" is that when you call htmlentities('xyz'), determine_charset() is called with charset_hint=NULL, and the first thing this function does is:
/* Guarantee default behaviour for backwards compatibility */
if (charset_hint == NULL)
    return cs_8859_1;
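So you have to pass at least an empty charset argument, e.g. htmlentities('xyz', ENT_QUOTES, ''), so that charset_hint is not NULL and the detection actually runs.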
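As a rough sketch of the two call styles (the variable names are made up, and what the empty-string call returns depends on your PHP version and configuration, so no output is claimed for it):

```php
<?php
// Two bytes: the UTF-8 encoding of "é".
$s = "\xC3\xA9";

// Explicit charset: the result does not depend on any default.
$explicit = htmlentities($s, ENT_QUOTES, 'UTF-8');

// Empty charset: charset_hint is no longer NULL, so PHP tries to work out
// the charset from its environment instead of using the hard-coded fallback.
$detected = htmlentities($s, ENT_QUOTES, '');

echo $explicit, "\n"; // &eacute;

// Shows whether detection on this particular setup ended up at UTF-8.
var_dump($explicit === $detected);
```

If the var_dump() prints false, detection picked something other than UTF-8 on that machine, and passing 'UTF-8' explicitly (or via a wrapper) remains the safer route.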