Convert CSV to UTF-8 in Python

Question

I am trying to create a duplicate CSV without a header. When I attempt this I get the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 1895: invalid start byte.

I've read the Python csv documentation on Unicode and UTF-8 encoding and tried to implement its approach. However, my output file is generated with no data in it, and I'm not sure what I am doing wrong.

import csv

path =  '/Users/johndoe/file.csv'

with open(path, 'r') as infile, open(path + 'final.csv', 'w') as outfile:

    def unicode_csv(infile, outfile):
        inputs = csv.reader(utf_8_encoder(infile))
        output = csv.writer(outfile)

        for index, row in enumerate(inputs):
            yield [unicode(cell, 'utf-8') for cell in row]
            if index == 0:
                 continue
        output.writerow(row)

    def utf_8_encoder(infile):
        for line in infile:
            yield line.encode('utf-8')

unicode_csv(infile, outfile)

user3062459 · Accepted Answer

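The solution was to pass two additional parameters to the open() call that reads the input file:

with open(path, 'r') as infile:

The two parameters are encoding='utf-8' and errors='ignore'. With errors='ignore', any byte that is not valid UTF-8 (such as the 0xa0 in the error message) is silently dropped instead of raising UnicodeDecodeError. This allowed me to create a duplicate of the original CSV without the header row and without the error. Below is the completed code.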

import csv

path =  '/Users/johndoe/file.csv'

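# newline='' is what the csv module documentation recommends for files
# passed to csv.reader and csv.writer.
with open(path, 'r', encoding='utf-8', errors='ignore', newline='') as infile, \
     open(path + 'final.csv', 'w', newline='') as outfile:
    inputs = csv.reader(infile)
    output = csv.writer(outfile)

    for index, row in enumerate(inputs):
        # Skip the first row so the output file has no header
        if index == 0:
            continue
        output.writerow(row)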
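
One caveat with errors='ignore' is that the offending bytes are discarded, so the copy can lose characters. Byte 0xa0 is a non-breaking space in Latin-1 and Windows-1252, so the file may have been saved in one of those encodings rather than UTF-8. The sketch below assumes cp1252 as the source encoding (an assumption, not something the question confirms) and writes the output explicitly as UTF-8. The function name copy_without_header and its parameters are hypothetical, chosen only for illustration.

```python
import csv


def copy_without_header(src_path, dst_path, src_encoding="cp1252"):
    """Copy a CSV to dst_path as UTF-8, skipping the first (header) row.

    src_encoding is an assumption about the input file; change it to
    whatever encoding the file was actually saved in.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as infile, \
         open(dst_path, "w", encoding="utf-8", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        next(reader, None)  # discard the header row, if there is one
        writer.writerows(reader)  # copy the remaining rows unchanged


# What the two approaches do with the byte from the traceback:
raw = b"10\xa0kg"
ignored = raw.decode("utf-8", errors="ignore")  # the 0xa0 byte is dropped
decoded = raw.decode("cp1252")                  # the 0xa0 byte becomes U+00A0
```

If the file turns out to be in a different encoding, passing that name as src_encoding should be enough; the rest of the function does not depend on it.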

Tags: python, csv, utf-8