I fetch a UTF-8 page containing Russian text using curl. If I echo the text, it displays correctly. Then I use the following code:
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');
/*** loop over the table rows ***/
for ($i = 0; $i <= 5; $i++)
{
    /*** get each column by tag name ***/
    $cols = $rows->item($i)->getElementsByTagName('td');
    echo $cols->item(2)->nodeValue;
    echo '<hr />';
}
$html contains Russian text. After this, the line echo $cols->item(2)->nodeValue; displays garbled text instead of Russian. I tried iconv, but it did not work. Any ideas?
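I suggest using mb_convert_encoding before loading the UTF-8 page. loadHTML assumes ISO-8859-1 unless the markup declares its own charset, so converting the non-ASCII characters to HTML entities first keeps them intact.
$dom = new DomDocument();
$html = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
@$dom->loadHTML($html);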
Or else you could try this:
$dom = new DomDocument('1.0', 'UTF-8');
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
..........
echo html_entity_decode($cols->item(2)->nodeValue, ENT_QUOTES, "UTF-8");
..........
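A third option, not from the answer above but a commonly used workaround, is to prepend an XML encoding declaration so that libxml reads the input as UTF-8. This may be preferable on PHP 8.2 and later, where passing 'HTML-ENTITIES' to mb_convert_encoding is deprecated. The sketch below is a minimal illustration under assumptions: the sample $html string is hypothetical and stands in for the page fetched with curl, and the $values array is introduced only for this example.

```php
<?php
// Hypothetical sample input standing in for the page fetched with curl.
$html = '<html><body><table>'
      . '<tr><td>1</td><td>2</td><td>Привет</td></tr>'
      . '</table></body></html>';

$dom = new DOMDocument();

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);

// The prepended XML declaration hints that the bytes are UTF-8.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();

// Read the third cell of every row in the first table.
$values = [];
$rows = $dom->getElementsByTagName('table')->item(0)->getElementsByTagName('tr');
foreach ($rows as $row) {
    $cols = $row->getElementsByTagName('td');
    if ($cols->length > 2) {            // skip rows with fewer than three cells
        $values[] = $cols->item(2)->nodeValue;
    }
}

foreach ($values as $value) {
    echo $value, "\n";
}
```

Iterating with foreach and checking $cols->length also avoids errors when the table has fewer rows or cells than a fixed loop such as $i <= 5 expects.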