
UTF-8 problems while reading CSV file with fgetcsv

I am trying to read a CSV file and echo its content, but the special characters are displayed incorrectly.

Mäx Müstermänn -> Mäx Müstermänn

The encoding of the CSV file is UTF-8 without BOM (checked with Notepad++).

This is the content of the CSV file:

"Mäx";"Müstermänn"

My PHP script:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<?php
$handle = fopen("specialchars.csv", "r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
while ($data = fgetcsv($handle, 1000, ";")) {
    $num = count($data);
    for ($c = 0; $c < $num; $c++) {
        // output data
        echo "<td>$data[$c]</td>";
    }
    echo "</tr><tr>";
}
?>
</body>
</html>

I tried setlocale(LC_ALL, 'de_DE.utf8'); as suggested here, without success. The content is still displayed incorrectly.

What am I missing?

Edit:

An echo mb_detect_encoding($data[$c],'UTF-8'); gives me UTF-8 UTF-8.

echo file_get_contents("specialchars.csv"); gives me "Mäx";"Müstermänn".

And

print_r(str_getcsv(reset(explode("\n", file_get_contents("specialchars.csv"))), ';')) 

gives me

Array ( [0] => Mäx [1] => Müstermänn )

What does it mean?

testing asked Jan 16 '12 15:01
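
The diagnostics in the question print strings that the browser or editor may already have re-interpreted, so they do not show which bytes are really stored. A minimal sketch of a byte-level check follows, assuming the mbstring extension is loaded; `describe_encoding` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: classify a raw string by inspecting its bytes.
function describe_encoding(string $raw): string
{
    // A UTF-8 byte order mark is the three bytes EF BB BF.
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 with BOM';
    }
    // mb_check_encoding() validates the whole byte sequence.
    return mb_check_encoding($raw, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8';
}

echo describe_encoding("M\xC3\xA4x"), "\n"; // valid UTF-8 ("Mäx" encoded as UTF-8)
echo describe_encoding("M\xE4x"), "\n";     // not valid UTF-8 ("Mäx" encoded as ISO-8859-1)
echo bin2hex("M\xC3\xA4x"), "\n";           // 4dc3a478
```

For the file from the question, the same check could be run as describe_encoding(file_get_contents("specialchars.csv")). Pure ASCII input is also reported as valid UTF-8.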




2 Answers

Try this:

<?php
$handle = fopen("specialchars.csv", "r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
while ($data = fgetcsv($handle, 1000, ";")) {
    $data = array_map("utf8_encode", $data); // added
    $num = count($data);
    for ($c = 0; $c < $num; $c++) {
        // output data
        echo "<td>$data[$c]</td>";
    }
    echo "</tr><tr>";
}
?>
robsonsanches answered Sep 29 '22 12:09
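
utf8_encode() converts ISO-8859-1 bytes to UTF-8, so it only helps when the fields are not already UTF-8. Applied to text that is already UTF-8, it encodes it a second time. The function is also deprecated as of PHP 8.2. A possible alternative is sketched below, assuming the mbstring extension is loaded and that any field that is not valid UTF-8 is ISO-8859-1; `fields_to_utf8` is a hypothetical helper name.

```php
<?php
// Hypothetical helper (not from the answer): return all fields as UTF-8.
// Assumption: a field that is not valid UTF-8 is ISO-8859-1, which is the
// same assumption utf8_encode() makes for every input.
function fields_to_utf8(array $fields): array
{
    return array_map(
        static function ($field) {
            $field = (string) $field;
            // Leave valid UTF-8 untouched to avoid double encoding.
            return mb_check_encoding($field, 'UTF-8')
                ? $field
                : mb_convert_encoding($field, 'UTF-8', 'ISO-8859-1');
        },
        $fields
    );
}

// "M\xE4x" is "Mäx" in ISO-8859-1; the UTF-8 form is "M\xC3\xA4x".
$latin1Row = ["M\xE4x", "M\xFCsterm\xE4nn"];
$utf8Row   = fields_to_utf8($latin1Row);
echo bin2hex($utf8Row[0]), "\n"; // 4dc3a478
```

In the loop above, this would replace the array_map("utf8_encode", $data) line with $data = fields_to_utf8($data);. The validity check is a heuristic: a short ISO-8859-1 string can, in rare cases, also be a valid UTF-8 byte sequence and would then be left unconverted.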


I encountered a similar problem when parsing a CSV file with special characters such as é, è, ö, etc.

The following worked fine for me:

To display the characters correctly on the HTML page, this header was needed:

header('Content-Type: text/html; charset=UTF-8'); 

In order to parse every character correctly, I used:

utf8_encode(fgets($file)); 

Don't forget to use the multibyte string functions in all subsequent string operations, for example:

mb_strtolower($value, 'UTF-8'); 
user2992220 answered Sep 29 '22 11:09
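
The reason for the multibyte functions is that the byte-oriented string functions count and cut bytes, while characters such as Ü and Ä take two bytes each in UTF-8. A small illustration follows, assuming the mbstring extension is loaded and the script file itself is saved as UTF-8; the sample value is made up for the example.

```php
<?php
// Assumption: this source file is saved as UTF-8, so the literal below
// contains two-byte sequences for Ü and Ä.
$value = "MÜSTERMÄNN";

// strlen() counts bytes, mb_strlen() counts characters.
echo strlen($value), "\n";                   // 12
echo mb_strlen($value, 'UTF-8'), "\n";       // 10

// The mb_* variants handle non-ASCII letters and never split a character.
echo mb_strtolower($value, 'UTF-8'), "\n";   // müstermänn
echo mb_substr($value, 0, 2, 'UTF-8'), "\n"; // MÜ

// substr() cuts in the middle of the two-byte Ü and leaves invalid UTF-8.
var_dump(mb_check_encoding(substr($value, 0, 2), 'UTF-8')); // bool(false)
```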