
PHP: UTF-8 character encoding

I am scraping a list of RSS feeds with cURL, then reading and parsing the RSS data with SimpleXML. The sorted data is then inserted into a MySQL database.

However, as you can see at http://dansays.co.uk/research/MNA/rss.php, several characters are not displaying correctly.

Examples:

âGuitar Hero: Van Halenâ Trailer And Tracklist Available

NV 10/10/09 – Salt Lake City, UT 10/11/09 – Denver, CO 10/13/09 –

I have tried using htmlentities and htmlspecialchars on the data before inserting it into the database, but that doesn't resolve the issue.

How can I resolve this?

Thanks for any advice.

Updated

I've tried what Greg suggested, and the issue is still here...

Here is the code I used to do SET NAMES in PDO:

$dbh = new PDO($dbstring, $username, $password);
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->query('SET NAMES "utf8"');

I echoed the SimpleXML data before it is sorted and inserted into the database, and I now believe the problem has something to do with cURL...

Here is what I have for cURL:

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_ENCODING, 'UTF-8');
$data = curl_exec($ch);
curl_close($ch);

$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
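
A note on the cURL options above, offered as a sketch rather than a confirmed fix. With CURLOPT_RETURNTRANSFER set to 0, curl_exec() prints the response and returns a boolean, so $data would not hold the feed. CURLOPT_ENCODING sets the Accept-Encoding header (gzip, deflate), not the character set. The helper names fetchFeed and feedTitles are made up for illustration, and the channel/item/title layout is an assumed RSS 2.0 structure:

```php
<?php
// Hypothetical helper: fetch a feed body as a string.
function fetchFeed(string $url): string
{
    $ch = curl_init($url);
    // Return the body from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Accept-Encoding, not charset: '' accepts every compression libcurl supports.
    curl_setopt($ch, CURLOPT_ENCODING, '');
    $data = curl_exec($ch);
    if ($data === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $data;
}

// Hypothetical helper: parse a feed and return its item titles.
// SimpleXML hands strings back as UTF-8 whatever encoding the feed declares.
function feedTitles(string $xml): array
{
    $doc = new SimpleXMLElement($xml, LIBXML_NOCDATA);
    $titles = [];
    foreach ($doc->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
}
```

Usage would be along the lines of feedTitles(fetchFeed($url)); the resulting strings are UTF-8 bytes, so whatever stores or displays them needs to treat them as UTF-8 too.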

Issue Resolved

I had to set the content charset of the RSS/HTML page to "UTF-8" to resolve this issue. I guess this isn't a real fix, as the character problems are still there in the raw data. Looking forward to proper support for it in PHP6!
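
For anyone reproducing this: the garbling in the examples is consistent with valid UTF-8 bytes being decoded as ISO-8859-1, which would explain why declaring the page charset makes it go away. A minimal sketch, assuming the mbstring extension is loaded; misreadAsLatin1 is a made-up helper name, not part of PHP:

```php
<?php
// Hypothetical helper: simulate a client decoding UTF-8 bytes as ISO-8859-1.
function misreadAsLatin1(string $utf8Bytes): string
{
    // Treat every byte as one ISO-8859-1 character and re-encode it as UTF-8.
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'ISO-8859-1');
}

// A title with curly quotes (U+201C, U+201D), stored as UTF-8.
$title   = "\u{201C}Guitar Hero: Van Halen\u{201D} Trailer";
$garbled = misreadAsLatin1($title);   // each quote becomes "â" plus two control characters

// Declaring the charset tells the browser how to decode the bytes.
// On a web page this has to run before any output is sent.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
```

The data itself is unchanged by the header; only the way the browser interprets it changes.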

Daniel Clark asked Jan 23 '23 08:01


1 Answer

Your page is being served as UTF-8 so I'd point my finger at the database.

Make sure the connection is in UTF-8 before any SELECTs or INSERTs. In MySQL:

SET NAMES "utf8"
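
With PDO you can also request the connection charset in the DSN instead of issuing SET NAMES yourself. A sketch, assuming PHP 5.3.6 or later with the pdo_mysql driver; the helper name utf8Dsn, the host, and the database name are placeholders:

```php
<?php
// Hypothetical helper: build a pdo_mysql DSN that requests a UTF-8 connection.
function utf8Dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    // utf8mb4 is MySQL's full UTF-8; plain "utf8" stores at most 3 bytes per character.
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

$dsn = utf8Dsn('localhost', 'feeds');   // placeholder values

// Connecting needs a real server, so it is left commented out here:
// $dbh = new PDO($dsn, $username, $password, [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);
```

The tables and columns need a UTF-8 character set as well, otherwise the connection setting alone won't help.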
Greg answered Jan 25 '23 20:01