
Detect file encoding in PHP

I have a script which combines a number of files into one, and it breaks when one of the files is UTF-8 encoded. I figure that I should be using the utf8_decode() function when reading the files, but I don't know how to tell which files need decoding.

My code is basically:

$output = '';
foreach ($files as $filename) {
    $output .= file_get_contents($filename) . "\n";
}
file_put_contents('combined.txt', $output);

Currently, at the start of a UTF8 file, it adds these characters in the output: 

nickf asked Feb 03 '09 00:02



1 Answer

Try the mb_detect_encoding() function. It examines your string and attempts to "guess" what its encoding is; you can then convert it as desired. As brulak suggested, however, you're probably better off converting everything to UTF-8 rather than from it, so that no characters in your data are lost.
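As a rough sketch of that approach (not a definitive fix), the helper below normalizes each file's contents to UTF-8 before they are concatenated. The names to_utf8() and combine_files() are made up for this example, and two things are assumptions rather than facts from the question: that the input files are either UTF-8 or ISO-8859-1, and that the stray characters at the start of the output are a UTF-8 byte order mark. Adjust the candidate list passed to mb_detect_encoding() to match your real data.

```php
<?php
// Hypothetical helper: return $data as UTF-8 with no leading byte order mark.
function to_utf8(string $data): string
{
    // Assumption: the stray leading characters are a UTF-8 BOM (EF BB BF).
    // If the BOM is present, the content is already UTF-8, so just drop it.
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        return substr($data, 3);
    }

    // Assumption: files are either UTF-8 or ISO-8859-1.
    // Strict mode rejects UTF-8 as a candidate when the bytes are not valid UTF-8.
    $encoding = mb_detect_encoding($data, ['UTF-8', 'ISO-8859-1'], true);

    if ($encoding === false || $encoding === 'UTF-8') {
        // Unknown or already UTF-8: leave the bytes untouched.
        return $data;
    }

    return mb_convert_encoding($data, 'UTF-8', $encoding);
}

// Hypothetical wrapper around the loop from the question.
function combine_files(array $files, string $target): void
{
    $output = '';
    foreach ($files as $filename) {
        $contents = file_get_contents($filename);
        if ($contents === false) {
            throw new RuntimeException("Cannot read $filename");
        }
        $output .= to_utf8($contents) . "\n";
    }
    file_put_contents($target, $output);
}
```

Calling combine_files($files, 'combined.txt') would then replace the original loop. Keep in mind that detection is a guess: short or ambiguous inputs can be misidentified, so if you already know a file's encoding, passing it to mb_convert_encoding() directly is more reliable.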

Ben Blank answered Oct 10 '22 02:10
