The app basically works like this:
1) The user uploads a CSV file.
2) The file is received by PHP via POST.
3) I open the file with fopen() and read it with fgetcsv().
The first column always starts with the \ufeff character. I know it is called the UTF-8 BOM and that Microsoft Excel generates it, but I can't manage to remove it.
I've tried: str_replace('\ufeff', '', $columns[0]);
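A likely reason this attempt has no effect: PHP does not interpret escape sequences inside single-quoted strings, so '\ufeff' is six literal characters rather than the BOM. A minimal sketch, assuming PHP 7.0 or later (for the "\u{FEFF}" syntax) and a hypothetical $columns array standing in for the fgetcsv() result:

```php
<?php
// Hypothetical first row, as fgetcsv() might return it for a BOM-prefixed file.
$columns = ["\xEF\xBB\xBFname", "email"];

// Single quotes: '\ufeff' is backslash, u, f, e, f, f (6 bytes), so nothing matches.
$unchanged = str_replace('\ufeff', '', $columns[0]);

// Double quotes with the \u{...} escape (PHP 7.0+) yield the real BOM bytes EF BB BF.
$cleaned = str_replace("\u{FEFF}", '', $columns[0]);

var_dump(strlen('\ufeff'));            // int(6)
var_dump($unchanged === $columns[0]);  // bool(true)
var_dump($cleaned);                    // string(4) "name"
```

Writing the bytes directly as "\xEF\xBB\xBF" in a double-quoted string should behave the same way on PHP versions that lack the \u{...} escape.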
When you view the file's contents with a read() function, '\ufeff' appears at the beginning of the returned text. One simple solution is to change the file's encoding back to ASCII, for example by copying the contents into Notepad and saving the file again.
The Unicode character U+FEFF is the byte order mark, or BOM, and is used to tell the difference between big- and little-endian UTF-16 encoding. If you decode the web page using the right codec, Python will remove it for you.
Yeah, U+FEFF is the UTF-8 byte order mark, which a lot of tools have trouble parsing. I'd just use standard UTF-8 encoding without it for compatibility reasons.
$columns[0] = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $columns[0]);
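The above code removes hidden characters from the value, such as the one you mentioned. Note that it strips every control character and every byte outside the ASCII range, so legitimate non-ASCII characters (accented letters, for example) are removed as well.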
$headings = array();
$handle = fopen($_FILES["contacts_file"]["tmp_name"], "r");
$heading_data = fgetcsv($handle);
foreach ($heading_data as $heading) {
    // Remove control characters and all non-ASCII bytes (including the BOM)
    $heading = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $heading);
    array_push($headings, $heading);
}
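An alternative that leaves non-ASCII characters intact is to skip the BOM on the file handle before the first fgetcsv() call. A minimal sketch, assuming a temporary file stands in for the upload; skip_utf8_bom is a hypothetical helper name, not a PHP built-in:

```php
<?php
// Hypothetical helper: advance a file handle past a leading UTF-8 BOM, if any.
function skip_utf8_bom($handle): void
{
    $start = ftell($handle);
    // Read the first three bytes and compare them with the UTF-8 BOM.
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        // No BOM: rewind so the first row is read in full.
        fseek($handle, $start);
    }
}

// Demo: a temporary file plays the role of $_FILES["contacts_file"]["tmp_name"].
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\xEF\xBB\xBFnombre,email\nZoë,zoe@example.com\n");

$handle = fopen($path, 'r');
skip_utf8_bom($handle);
// Delimiter, enclosure and escape are spelled out explicitly.
$headings = fgetcsv($handle, 0, ',', '"', '\\');
$row = fgetcsv($handle, 0, ',', '"', '\\');
fclose($handle);
unlink($path);

var_dump($headings[0] === 'nombre'); // bool(true)
```

Because only the first three bytes are inspected, the rest of the file is passed to fgetcsv() unchanged.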
$result = trim($result, "\xEF\xBB\xBF");
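This is the simplest way to solve it. Be aware that trim() treats its second argument as a set of bytes and strips them from both ends, so a value that ends in a character whose UTF-8 encoding finishes with one of these bytes (such as », which is \xC2\xBB) loses that final byte.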
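A prefix check is one way to remove exactly one leading BOM without touching the rest of the string. A minimal sketch; strip_utf8_bom is a hypothetical helper name, not a PHP built-in, and the sample value is made up for illustration:

```php
<?php
// Hypothetical helper: remove one leading UTF-8 BOM and nothing else.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    // strncmp() compares only the first three bytes of $text with the BOM.
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Sample value ending in "»" (UTF-8 bytes C2 BB).
$value = "\xEF\xBB\xBFquote »";

$prefixOnly = strip_utf8_bom($value);
$trimmed = trim($value, "\xEF\xBB\xBF");

var_dump($prefixOnly === "quote »");   // bool(true)
var_dump($trimmed === "quote \xC2");   // bool(true): trim() also ate the trailing BB byte
```

Applied to the question's data, this would be called on the first column only, for example $columns[0] = strip_utf8_bom($columns[0]);.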