Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Opening CSV with UTF-8 BOM via Excel

I create csv file with data by the means of java. And I faced the following well-known issue: the letters in Portuguese were displayed by the wrong way in Excel (when opening by double click).

I solved this by UTF-16LE+BOM, but excel started to recognize tabs as columns separators instead of commas.

So I looked up for another solution and saw many posts, in which people say that just adding UTF-8 BOM and writing file in UTF-8 will do the job for Excel 2007 and later. I tried the simpliest sample on my work computer and it failed. But when I tried this at my home computer it worked like a charm.

Both computers have the same versions of java installed and operating system Windows 7. I am confused. Can anyone tell what can cause such a strange behaviour?

You can see my simpliest sample below:

String filename = "D:/check/test_with_bom.csv";
        FileOutputStream fos = new FileOutputStream(filename);
        byte[] bom = new byte[] { (byte)0xEF, (byte)0xBB, (byte)0xBF }; 
        fos.write(bom);
        OutputStreamWriter osw = new OutputStreamWriter(fos , "UTF-8");
        PrintWriter printWriter = new PrintWriter(osw);

        printWriter.print("Hello,Olá,ão,ção");
        printWriter.close();
like image 819
me1111 Avatar asked Nov 28 '13 22:11

me1111


People also ask

How do I save a CSV file in UTF-8 BOM?

Open your CSV file with any text editor that supports both BOM and NON-BOM. Save it again without BOM (for example, in Notepad++ , select Encoding | Encode in UTF-8 and save the file).

How do I automatically open a CSV file in Excel with columns?

To open a comma-delimited (CSV) file properly, use Excel's Data Import from Text feature to open the import wizard and set all columns as text.


1 Answers

You should be aware that Excel does not "open" csv files. It converts them to an Excel file on the fly, using defaults. These defaults can be different depending on your regional settings. Because of that, it's never a good idea to let Excel open csv files using the defaults, since you'll never know for sure what you end up with.

A safer method is to use the 'import from text' method, and explicitly specify the delimiter, encoding, etc... Yet, be aware that 'save as csv' in Excel is an even worse idea, since it does not allow you to specify the encoding, delimiter, or any other detail. Access does.

On American Windows versions of Excel, the default column separator is a comma. On European Windows versions the comma is reserved for the Decimal Symbol and to avoid conflicts, a semicolon is used by default as column separator.

If you -really- -really- -have- to use CSV, you can consider adding the "sep=," indicator at the top of your csv file. yet, be aware that this will probably cause problems in other applications.

like image 53
Wouter Avatar answered Sep 24 '22 09:09

Wouter