We have a web application where users can enter their own HTML in a textarea. We save that data to our database.
When we load the HTML back into the textarea, we of course run it through htmlentities() first. Otherwise users could save </textarea> inside the textarea and our application would break when loading that into the textarea.
This works great, except when entering Chinese characters (and probably other languages such as Arabic or Japanese).
The htmlentities() call makes the Chinese text unusable, like this: �¨�³�¼�§ï. When I remove htmlentities() before loading the entered HTML into the textarea, Chinese characters show up just fine, but then we have the problem of HTML interfering with our textarea, especially when a user enters </textarea> inside the textarea.
I hope that makes sense.
Does anyone know how we can safely and correctly allow languages such as Chinese, Japanese, ... to be used inside our text area, while still being safe for loading any html inside our text area?
Have you tried using htmlspecialchars?
I currently use that in production and it's fine.
$foo = "我的名字叫萨沙";
echo '<textarea>' . htmlspecialchars($foo) . '</textarea>';
Alternately,
$str = "你好";
echo mb_convert_encoding($str, 'UTF-8', 'HTML-ENTITIES');
As found on http://www.techiecorner.com/129/php-how-to-convert-iso-character-htmlentities-to-utf-8/
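Note that the mb_convert_encoding() call above goes in the other direction: it turns HTML entities back into UTF-8 characters, and the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2. If the stored text really does contain numeric entities (an assumption, since the snippet does not show any), a rough equivalent using html_entity_decode() could look like this. The sample value in $encoded is made up for illustration.

```php
<?php
// Assumed input: text that was previously stored as numeric HTML entities.
$encoded = '&#20320;&#22909;';

// Decode the entities into real UTF-8 characters.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Escape again before placing the text inside a textarea.
echo '<textarea>' . htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8') . '</textarea>';
```

This only helps if the data was entity-encoded in the first place; plain UTF-8 text passes through html_entity_decode() unchanged.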
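Specify the charset, e.g. UTF-8, and it should work. Before PHP 5.4, htmlentities() assumed ISO-8859-1 when no charset was given, which mangles multibyte UTF-8 text.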
echo htmlentities($data, ENT_COMPAT, 'UTF-8');
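To tie this together, here is a minimal sketch of a helper for filling the textarea. It assumes the data is stored as UTF-8 and the page is served as UTF-8. The function name escape_for_textarea and the sample string are made up for illustration and are not from the original application.

```php
<?php
// Hypothetical helper: escapes user-supplied HTML for use inside <textarea>.
function escape_for_textarea(string $html): string
{
    // ENT_QUOTES escapes both single and double quotes.
    // ENT_SUBSTITUTE replaces invalid byte sequences with U+FFFD
    // instead of returning an empty string.
    return htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Sample stored value: Chinese text plus markup that would otherwise
// close the textarea early.
$stored = '我的名字叫萨沙 </textarea><b>bold</b>';

echo '<textarea>' . escape_for_textarea($stored) . '</textarea>';
```

htmlspecialchars() only touches &, <, > and the quote characters, so the Chinese text is left as-is, while the embedded </textarea> becomes &lt;/textarea&gt; and can no longer close the field.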