I am using CKEditor to let users post comments, and users can also enter Unicode characters in the comment box.
When I submit the form and check $_POST["reply"], the Unicode characters are displayed correctly. I have also sent header('Content-type:text/html; charset=utf-8'); at the top of the page.
But when I process the input using PHP DOMDocument, all the characters become unreadable.
$html_unicode = "xyz unicode data";
$html_data = '<body>'.$html_unicode . '</body>';
$dom = new DOMDocument();
$dom->loadHTML($html_data );
$elements = $dom->getElementsByTagName('body');
When I echo the text content:
echo $dom->textContent;
the output becomes:
§Ø³ÙبÙÙ ÙÙÚº غرÙب ک٠آÙÛ ÙÛÙ
How can I get the proper Unicode characters back using PHP DOMDocument?
This worked for me:
$html_unicode = "xyz unicode data";
$html_data = '<body>'.$html_unicode . '</body>';
$dom = new DOMDocument();
$html_data = mb_convert_encoding($html_data, 'HTML-ENTITIES', 'UTF-8'); // requires the mbstring extension
$dom->loadHTML($html_data);
$elements = $dom->getElementsByTagName('body');
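A note for newer PHP versions: as of PHP 8.2 the 'HTML-ENTITIES' target of mb_convert_encoding() is deprecated. Below is a minimal sketch of the same idea using mb_encode_numericentity(), assuming the input is valid UTF-8. The helper name to_numeric_entities and the sample string are only for illustration and are not from the original post.

```php
<?php
// Hypothetical helper: turn every non-ASCII character into a numeric
// entity, so loadHTML() only ever sees plain ASCII bytes.
function to_numeric_entities(string $utf8): string
{
    // Map: code points 0x80..0x10FFFF, no offset, mask keeps all bits.
    return mb_encode_numericentity($utf8, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
}

$html_unicode = 'سلام دنیا'; // sample text standing in for the posted comment
$html_data = '<body>' . to_numeric_entities($html_unicode) . '</body>';

$dom = new DOMDocument();
$dom->loadHTML($html_data);

// Read the text back from the <body> element; DOMDocument returns UTF-8.
$body = $dom->getElementsByTagName('body')->item(0);
echo $body->textContent;
```

This still needs the mbstring extension, and the page still has to be served as UTF-8 for the browser to display the result correctly.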
Try this :)
<?php
$html_unicode = "xyz unicode data";
$html_data = '<body>'.$html_unicode . '</body>';
$dom = new DOMDocument();
$dom->loadHTML($html_data );
$elements = $dom->getElementsByTagName('body');
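// loadHTML() read the UTF-8 bytes as ISO-8859-1; utf8_decode() reverses that.
// Note: utf8_decode() is deprecated as of PHP 8.2.
echo utf8_decode($dom->textContent);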
?>
Thank God, I got the solution by just replacing
$html_data = '<body>'.$html_unicode . '</body>';
with
$html_data = '<head><meta http-equiv="Content-Type"
content="text/html; charset=utf-8">
</head><body>' . $html_unicode . '</body>';
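For anyone who wants to try it end to end, here is a minimal sketch of the meta tag approach as a complete script. It assumes the script file itself is saved as UTF-8; the sample string is only a placeholder for the posted comment.

```php
<?php
$html_unicode = 'سلام دنیا'; // sample text standing in for the posted comment

// The meta tag tells the HTML parser that the bytes which follow are UTF-8,
// instead of letting it fall back to its ISO-8859-1 default.
$html_data = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
    . '<body>' . $html_unicode . '</body>';

$dom = new DOMDocument();
$dom->loadHTML($html_data);

// Read the text back from the <body> element.
$body = $dom->getElementsByTagName('body')->item(0);
echo $body->textContent;
```

The meta tag has to come before the content, since the parser only switches encoding once it has seen the declaration.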