Writing/Reading special characters from CSV (Python 3.6)

Question

Let's assume that I need to write and then read back a list of strings containing Polish words in a .csv file in Python 3.6:

lista=['szczęśliwy','jabłko','słoń','kot']

Since it doesn't seem possible to write Unicode characters to the .csv directly, I encode the strings to UTF-8, so the data is saved like this in the file (all inside the first .csv cell):

b'szcz\xc4\x99\xc5\x9bliwy',b'jab\xc5\x82ko',b's\xc5\x82o\xc5\x84',b'kot'

But I am not able to decode the data from the output.csv file using this code:

import csv

with open('output.csv') as csvarchive:
    entrada = csv.reader(csvarchive)
    for reg in entrada:
        lista2 = reg

print(lista2)
["b'szcz\xc4\x99\xc5\x9bliwy'", "b'jab\xc5\x82ko'", "b's\xc5\x82o\xc5\x84'", "b'kot'"]

lista2 is still a list of strings, but each string holds the text of a UTF-8 bytes literal, and I am not able to recover the special characters.

I tried several things, like reading the file in 'rb' mode and encoding and decoding again, but since I am new to these matters I couldn't get it to work. There must be a very easy solution.

Tomalak · Accepted Answer

Never open text files without specifying an encoding (this is generally true).
Always open CSV files with newline='' (this applies to the Python csv module, which then handles line endings itself).

So, assuming your CSV file is UTF-8-encoded, use:

import csv

with open('output.csv', 'r', encoding='UTF-8', newline='') as csvarchive:
    entrada = csv.reader(csvarchive)
    for reg in entrada:
        # do something with the data row, it's already decoded
        print(reg)

The same applies to writing the file:

import csv

with open('output.csv', 'w', encoding='UTF-8', newline='') as csvarchive:
    writer = csv.writer(csvarchive)
    # write data to the writer, it will be encoded automatically
    # (lista is the list of plain str values from the question)
    writer.writerow(lista)
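Putting the write and read steps together, here is a minimal round-trip sketch. The helper name round_trip and the temporary directory are my own choices for illustration, not part of the question; the point is only that plain str values go in and come back unchanged.

```python
import csv
import os
import tempfile

lista = ['szczęśliwy', 'jabłko', 'słoń', 'kot']


def round_trip(words, path):
    """Hypothetical helper: write one CSV row of str values, then read it back."""
    # Plain str values go to the writer; the file object does the UTF-8 encoding.
    with open(path, 'w', encoding='utf-8', newline='') as f:
        csv.writer(f).writerow(words)
    # Reading with the same encoding yields decoded str values again.
    with open(path, 'r', encoding='utf-8', newline='') as f:
        return next(csv.reader(f))


# A temporary directory is used here only to keep the sketch self-contained.
with tempfile.TemporaryDirectory() as tmp:
    lista2 = round_trip(lista, os.path.join(tmp, 'output.csv'))

print(lista2)
```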

There is no need to do any manual string encoding. Pass plain string values to the csv writer; the file object encodes them transparently as they are written.
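As for why the file ended up containing b'...' text: csv.writer converts any non-string value with str(), and str() of a bytes object is its literal representation. If such a file already exists and cannot simply be regenerated, the cells can probably be parsed back with ast.literal_eval. The sketch below reproduces the problem in memory and then undoes it; recover_cell is a hypothetical helper name, and it assumes every damaged cell looks like a bytes literal holding UTF-8 data.

```python
import ast
import csv
import io

words = ['szczęśliwy', 'jabłko', 'słoń', 'kot']

# Reproduce the problem: bytes values are stringified by the writer,
# so the literal text b'...' lands in the output.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow([w.encode('utf-8') for w in words])
damaged = buffer.getvalue()


def recover_cell(cell):
    """Hypothetical helper: turn the text of a bytes literal back into a str."""
    if cell.startswith(("b'", 'b"')):
        # literal_eval only parses Python literals; it does not run code.
        return ast.literal_eval(cell).decode('utf-8')
    return cell  # already ordinary text, leave untouched


row = next(csv.reader(io.StringIO(damaged, newline='')))
recovered = [recover_cell(cell) for cell in row]
print(recovered)
```

This is a repair for existing data only; for new files, writing str values as described above avoids the problem entirely.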

Tags: python, csv, unicode, utf-8, decode

Pacullamen
