I have a very annoying encoding problem using opencsv. When I export a csv file, I set character type as 'UTF-8'.
CSVWriter writer = new CSVWriter(new OutputStreamWriter("D:/test.csv", "UTF-8"));
but when I open the csv file with Microsoft Office Excel 2007, it turns out that it has 'UTF-8 BOM' encoding?
Once I save the file in Notepad and re-open, the file turns back to UTF-8 and all the letters in it appears fine. I think I've searched enough, but I haven't found any solution to prevent my file from turning into 'UTF-8 BOM'. any ideas, please?
I suppose your file has a 'UTF-8 without BOM' encoding. You better feed BOM encoding to your file, even though it's not necessary in most cases, but only one obvious exception is when you deal with ms excel.
FileOutputStream os = new FileOutputStream(file);
os.write(0xef);
os.write(0xbb);
os.write(0xbf);
CSVWriter csvWrite = new CSVWriter(new OutputStreamWriter(os));
Now your file will be understood by excel as utf-8 csv.
UTF-8
and UTF-8 Signature
(which incorrectly named sometimes as UTF-8 BOM
) are same encodings, and signature is used only to distinguish it from any other encodings. Any unicode application should process UTF-8 signature (which is three bytes sequence EF BB BF
) correctly.
Why Java is specifically adds this signature and how to stop it doing that I don't know.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With