Converting ANSI to UTF-8 in shell

I'm writing a parser script (1 CSV in, 3 CSV out) and I have a problem. I am French, so my language has letters like é, è, à, ...

A customer sent me a CSV file whose encoding Linux reports as "unknown-8bit" (ANSI, I guess).

My script writes 3 new CSV files. But Vim creates them as ISO Latin-1 because that is close to the encoding of the input, and my é, è, à, ... are broken. I need UTF-8.

So I tried to convert the original ANSI CSV to UTF-8:

iconv -f "windows-1252" -t "UTF-8" import.csv -o import.csv

The problem is that it breaks my CSV: everything is now on a single row, although my special characters are fine. Is there a way to convert ANSI to UTF-8 and keep my rows?

Neringan asked Nov 28 '13 10:11




1 Answer

Put the output into another file. Don't overwrite the old one.

iconv -f "windows-1252" -t "UTF-8" import.csv -o new_import.csv

iconv cannot reliably read from and write to the same file: the output file may be truncated before the input has been fully read.
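If the converted file has to end up under the original name, one option is to convert into a temporary file and rename it only when iconv succeeds. The following is a minimal sketch, not a definitive recipe: the file names sample.csv and sample.csv.utf8.tmp are hypothetical, the two-line sample is created only so the snippet runs on its own, and it assumes the input really is Windows-1252.

```shell
#!/bin/sh
# Hypothetical file names, chosen for illustration only.
src="sample.csv"
tmp="$src.utf8.tmp"

# Create a two-line Windows-1252 sample so the sketch is self-contained.
# \351 is the single byte for "é" in Windows-1252.
printf 'nom;ville\nRen\351;Paris\n' > "$src"

# Convert into a separate file; replace the original only if iconv succeeded.
if iconv -f WINDOWS-1252 -t UTF-8 "$src" > "$tmp"; then
    mv "$tmp" "$src"
else
    rm -f "$tmp"
    echo "conversion failed; $src left untouched" >&2
fi

# Sanity check: the number of lines should be unchanged by the conversion.
wc -l < "$src"
```

Because the original is only replaced after a successful conversion, a failed run (for example, a wrong source encoding) leaves the input file intact.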

Grzegorz Żur answered Oct 21 '22 09:10