
How to read this XML? Getting "parser error : CData section not finished"

I'm trying to read this XML file (an RSS feed), but without success. I get this error:

    Warning: simplexml_load_file(): http://noticias.perfil.com/feed/:232: parser error : CData section not finished <p>La sola lectura de los datos estadísticos desp in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

    Warning: simplexml_load_file(): Isis, con lo que habría logrado un nuevo respaldo a sus proyectos terroristas. in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

    Warning: simplexml_load_file(): ^ in D:\xampp\FerreWoo\scrap-rvnot.php on line 43

I'm using this code:

   $feed = simplexml_load_file($urls, null, LIBXML_NOCDATA);

I tried cURL too, but the same errors keep coming.

I know the XML file may be malformed, but there must be a way to read it, right?

Targaryen asked Mar 09 '23 23:03



1 Answer

That XML contains some invalid characters. Try the code below:

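    $url  = 'http://noticias.perfil.com/feed/';
    $html = file_get_contents($url);

    // Strip every character that XML 1.0 does not allow.
    // Code points above 0xFF need the \x{...} syntax and the /u modifier;
    // written as \xD7FF, PCRE reads \xD7 followed by the literal text "FF".
    // Note: with /u, preg_replace() returns null if $html is not valid UTF-8.
    $invalid_characters = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $html = preg_replace($invalid_characters, '', $html);

    $xml = simplexml_load_string($html);

    // For testing only: dump the parsed feed as a nested array
    $encode = json_encode($xml);
    $decode = json_decode($encode, true);
    print_r($decode);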
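
If the parse still fails after cleaning, it helps to see what libxml is actually complaining about instead of a wall of warnings. Below is a minimal sketch of that idea, not a definitive implementation. The function name `load_feed_string` and the inline sample are hypothetical, made up for illustration. It assumes the feed is UTF-8, and my guess is that a stray control character inside a CDATA block is what triggers "CData section not finished" here.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of any library):
// removes characters that XML 1.0 forbids, then parses the string while
// collecting libxml errors in an array instead of emitting PHP warnings.
function load_feed_string(string $raw): array
{
    // Character ranges allowed by XML 1.0; /u makes PCRE treat $raw as UTF-8.
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $raw
    );
    if ($clean === null) {
        // preg_replace() returns null on failure, e.g. input that is not valid UTF-8.
        return ['xml' => false, 'errors' => ['input is not valid UTF-8']];
    }

    // Collect parse errors internally; remember the previous setting to restore it.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($clean, 'SimpleXMLElement', LIBXML_NOCDATA);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return ['xml' => $xml, 'errors' => $errors];
}

// Usage with an inline sample: "\x0B" (vertical tab) is not allowed in XML 1.0.
$sample = "<rss><channel><title><![CDATA[Hola\x0B mundo]]></title></channel></rss>";
$result = load_feed_string($sample);
echo $result['xml']->channel->title, "\n";
```

For the real feed you would pass the result of `file_get_contents($url)` instead of `$sample`, and check that `$result['xml']` is not `false` before using it; the `errors` array then tells you which line is still broken.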

Hope it helps

rheeantz answered Apr 06 '23 23:04
