Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excel files with byte order mark for UTF-8 causing errors

I am just wondering if there is any possible way to fix this. I receive Excel files at work that contain the splicing information for fibre joints, and when I try to import then I get an output error saying the first table didn't match up with what it expected to see. When I then open it, I can see the BOM for UTF-8 as  directly before the word like this Joint Name.

Obviously this is hidden in the original file, and it causes a bit of manual work having to remove each of the symbols from each Excel file. Currently the files I am receiving are in a .csv format and the error files come back as a .csv with the symbols showing.

Is there any way I can pre-save these to remove the BOM or something to avoid the manual work in this situation?

Ideally I'd like to remove the hidden BOM so that the import just works the first time without rework.

like image 462
Mathew Hood Avatar asked Nov 03 '15 22:11

Mathew Hood


People also ask

Does UTF-8 require a byte order mark?

The byte-order mark indicates which order is used, so that applications can immediately decode the content. In the UTF-8 encoding, the presence of the BOM is not essential because, unlike the UTF-16 encodings, there is no alternative sequence of bytes in a character.

How do I enable UTF-8 in Excel?

Click Tools, then select Web options. Go to the Encoding tab. In the dropdown for Save this document as: choose Unicode (UTF-8). Click Ok.


1 Answers

I have figured out the answer, will leave it here for anyone who is encountering a similar issue.

Open the csv file in Notepad++, in the encoding menu you can change from UTF-8 BOM to UTF-8. Save the file and it is gone, simple as that.

Good luck!

like image 158
Mathew Hood Avatar answered Oct 05 '22 02:10

Mathew Hood