 

Get source code with Chinese characters PHP

Well, I give up. I've tried everything I could think of to retrieve data from a target website whose content is in a Simplified Chinese encoding (charset=GB2312).

I've been using simple_html_parser as always, but it doesn't return the Chinese characters. All I get are question marks inside diamond shapes, like so: "�������ѯ�ؼ��֣�"

Declaring the encoding for the PHP file didn't do anything except get rid of an unwanted character that was showing at the start of the page.

By declaring it I mean:

header('Content-Type: text/html; charset=GB2312');

I can't get any data that's written in Chinese. I also tried file_get_contents, with the same result. I'm probably missing something obvious, since I can't find any related discussion elsewhere.

Thanks in advance.

johnnyArt asked Jun 17 '26 08:06


1 Answer

Have you tried converting the encoding with mb_convert_encoding or iconv, e.g.

$str = mb_convert_encoding($content, 'UTF-8', 'GB2312');

or

$str = iconv("GB2312", "UTF-8//IGNORE", $content);
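
To make the conversion concrete, here is a minimal sketch, assuming the mbstring extension is loaded and that the page really is GB2312. The helper name gb2312_to_utf8 and the sample bytes are made up for illustration; in practice $raw would be whatever file_get_contents (or the HTML parser) returned for the target page.

```php
<?php
// Hypothetical helper (not part of any library): convert raw GB2312 bytes
// to a UTF-8 string using mbstring.
function gb2312_to_utf8(string $raw): string
{
    // Argument order is (string, to-encoding, from-encoding).
    return mb_convert_encoding($raw, 'UTF-8', 'GB2312');
}

// Stand-in for the fetched page: the two characters "中文" as GB2312 bytes.
// With a real page this would be something like:
//   $raw = file_get_contents($url);   // $url is an assumption
$raw = "\xD6\xD0\xCE\xC4";

$utf8 = gb2312_to_utf8($raw);

// When serving the result to a browser, declare UTF-8 (not GB2312) first:
//   header('Content-Type: text/html; charset=UTF-8');
echo $utf8; // prints 中文
```

Converting once, right after fetching, and then working in UTF-8 everywhere else usually keeps the parser, the PHP source file, and the output header in agreement with each other.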

Gordon answered Jun 18 '26 23:06


