Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ruby: How to generate CSV files that has Excel-friendly encoding

Tags:

I am generating CSV files that needs to be opened and reviewed in Excel once they have been generated. It seems that Excel requires a different encoding than UTF-8.

Here is my config and generation code:

csv_config = {col_sep: ";",                row_sep: "\n",                encoding: Encoding::UTF_8              }  csv_string = CSV.generate(csv_config) do |csv|   csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"] end 

When opening this in Excel, the special characters are not being displayed properly:

Text a  Text b  Text æ Text ø Text å 

Any idea how to ensure proper encoding?

like image 478
Cjoerg Avatar asked May 21 '15 08:05

Cjoerg


People also ask

What encoding does Excel use for CSV?

From memory, Excel uses the machine-specific ANSI encoding.

What is the best encoding for CSV?

The CSV file must be saved with UTF-8 or RFC-4180 encoding for special and multi-byte characters to import correctly. You can use utilities, such as Notepad++ to save the file in UTF-8 format.


1 Answers

The top voted answer from @joaofraga worked for me, but I found an alternative solution that also worked - no UTF-8 to ISO-8859-1 transcoding required.

From what I've read, Excel, can indeed handle UTF-8, but for some reason, it doesn't recognize it by default. But if you add a BOM to the beginning of the CSV data, this seems to cause Excel to realise that the file is UTF-8.

So, if you have a CSV like so:

csv_string = CSV.generate(csv_config) do |csv|   csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"] end 

just add a BOM byte like so:

"\uFEFF" + csv_string 

In my case, my controller is sending the CSV as a file, so this is what my controller looks like:

def show   respond_to do |format|     format.csv do       #  add BOM to force Excel to realise this file is encoded in UTF-8, so it respects special characters       send_data "\uFEFF" + csv_string, type: :csv, filename: "csv.csv"     end   end end 

I should note that UTF-8 itself does not require or recommend a BOM at all, but as I mentioned, adding it in this case seemed to nudge Excel into realising that the file was indeed UTF-8.

like image 140
deepmotion Avatar answered Oct 23 '22 13:10

deepmotion