Well, I give up. I've tried everything I could think of to retrieve data from a target website whose content is in Simplified Chinese (charset=GB2312).
I've been using the simple_html_parser as always, but it doesn't return the Chinese characters; all I get are question marks inside a diamond shape (like so: "�������ѯ�ؼ��֣�").
Declaring the encoding in the PHP file didn't do anything except get rid of some unwanted characters showing at the start of the page.
By declaring it I mean:
header('Content-Type: text/html; charset=GB2312');
I can't get any data that's written in Chinese. I also tried file_get_contents, with the same result. I'm probably missing something obvious, since I can't find any related discussion elsewhere.
Thanks in advance.
Have you tried converting the encoding with mb_convert_encoding or iconv? For example:
$str = mb_convert_encoding($content, 'UTF-8', 'GB2312');
or
$str = iconv("GB2312", "UTF-8//IGNORE", $content);
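For illustration, here is a minimal sketch of the whole conversion step. It assumes the page really is GB2312-encoded and that the mbstring extension is loaded. The helper name gb_to_utf8 and the sample bytes are made up for this example; in practice $raw would come from file_get_contents on the target URL.

```php
<?php
// Hypothetical helper (not part of any library): convert raw bytes
// fetched from a GB2312 page into UTF-8 so they can be parsed and echoed.
function gb_to_utf8(string $bytes, string $from = 'GB2312'): string
{
    // mb_convert_encoding takes (string, to_encoding, from_encoding)
    return mb_convert_encoding($bytes, 'UTF-8', $from);
}

// Sample input: the two characters 中文 as GB2312 bytes.
// In real use this would be: $raw = file_get_contents($url);
$raw = "\xD6\xD0\xCE\xC4";

$utf8 = gb_to_utf8($raw);

// Sanity check that the result is well-formed UTF-8 before parsing it.
$ok = mb_check_encoding($utf8, 'UTF-8');
```

After converting, hand $utf8 (not the raw bytes) to the HTML parser, and send the page as UTF-8 with header('Content-Type: text/html; charset=UTF-8'); note that header() takes the whole header as a single string. If some characters still come out wrong, it may be worth trying 'GB18030' as the source encoding, since pages labelled GB2312 sometimes use characters from the larger GBK/GB18030 sets; that is a guess about the target site, not something confirmed here.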