UnicodeDecodeError in Python 3 when importing a CSV file

Tags: python, python-3.x, csv, unicode, non-ascii-characters

I'm trying to import a CSV, using this code:

    import csv
    import sys

    def load_csv(filename):
        # Open file for reading
        file = open(filename, 'r')

        # Read in file
        return csv.reader(file, delimiter=',', quotechar='\n')

    def main(argv):
        csv_file = load_csv("myfile.csv")

        for item in csv_file:
            print(item)

    if __name__ == "__main__":
        main(sys.argv[1:])

Here's a sample of my csv file:

    foo,bar,test,1,2
    this,wont,work,because,α

And the error:

    Traceback (most recent call last):
      File "test.py", line 22, in <module>
        main(sys.argv[1:])
      File "test.py", line 18, in main
        for item in csv_file:
      File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)

Obviously, it's hitting the non-ASCII character at the end of the CSV and throwing that error, but I'm at a loss as to how to fix this. Any help?

This is:

    Python 3.2.3 (default, Apr 23 2012, 23:35:30)
    [GCC 4.7.0 20120414 (prerelease)] on linux2

786

asked Oct 05 '12 18:10

Ryan Rapini

3 Answers

It seems your problem boils down to:

    print("α")

You could fix it by specifying PYTHONIOENCODING:

    $ PYTHONIOENCODING=utf-8 python3 test.py > output.txt

Note that

    $ python3 test.py

should work as is if your terminal configuration supports it, where test.py is:

    import csv

    with open('myfile.csv', newline='', encoding='utf-8') as file:
        for row in csv.reader(file):
            print(row)

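If open() is called without the encoding parameter, as in the question, Python falls back to the locale's preferred encoding. With LC_ALL=C on Python 3.2 that is ASCII, which gives the UnicodeDecodeError shown in the traceback.

With LC_ALL=C you also get a UnicodeEncodeError from print() even when the output is not redirected, so PYTHONIOENCODING is necessary in that case (before PEP 538, Legacy C Locale Coercion, implemented in Python 3.7+).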
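Both failure modes can be reproduced without touching the locale. This is a minimal sketch, not taken from the original answer: `try_decode` and `try_print` are hypothetical helper names, and an in-memory `io.TextIOWrapper` stands in for a stdout whose encoding is ASCII or UTF-8.

```python
import io

# The UTF-8 encoding of "α" is two bytes; 0xce is the byte named in the traceback.
raw = "α".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or the name of the exception raised."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


def try_print(text, encoding):
    """Print text to an in-memory stream that stands in for stdout.

    Returns the bytes written, or the name of the exception raised.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
        return stream.buffer.getvalue()
    except UnicodeEncodeError as exc:
        return type(exc).__name__


# Reading side: what open() does with an ASCII vs. a UTF-8 encoding.
decode_ascii = try_decode(raw, "ascii")
decode_utf8 = try_decode(raw, "utf-8")

# Writing side: what print() does when stdout is ASCII vs. UTF-8.
print_ascii = try_print("α", "ascii")
print_utf8 = try_print("α", "utf-8")
```

The reading side is governed by the `encoding` passed to `open()`, the writing side by the encoding of stdout, which is what PYTHONIOENCODING overrides.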

179

answered Sep 18 '22 15:09

jfs

From the Python docs, you have to set the encoding for the file. Here is an example from the docs:

    import csv

    with open('some.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

Edit: Your problem appears to happen when printing. Try using the pretty printer:

    import csv
    import pprint

    with open('some.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            pprint.pprint(row)
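Applied to the `load_csv` function from the question, the same idea might look like the sketch below. It assumes the file really is UTF-8; the `encoding` parameter with its default and the temporary sample file are illustrative additions, not part of the original code. Turning the function into a generator keeps the file open only while rows are being read.

```python
import csv
import os
import tempfile


def load_csv(filename, encoding="utf-8"):
    """Yield rows from filename, decoding it with an explicit encoding."""
    # newline='' lets the csv module handle line endings itself.
    with open(filename, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            yield row


# Write the sample data from the question to a temporary UTF-8 file.
sample = "foo,bar,test,1,2\nthis,wont,work,because,α\n"
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", encoding="utf-8", newline="") as f:
    f.write(sample)

# Reading no longer depends on the locale's default encoding.
rows = list(load_csv(path))
os.remove(path)
```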

answered Sep 18 '22 15:09

TheDude

Another option is to cover up the errors by passing an error handler:

    import csv

    with open('some.csv', newline='', errors='replace') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

which will replace any undecodable bytes in the file with the Unicode replacement character (U+FFFD), so the original characters are lost.
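To see what the error handler does to the data, here is a small sketch that decodes the sample from the question in memory. `read_rows` is a hypothetical helper, and the ASCII encoding is passed explicitly to mimic a C locale rather than relying on the machine's default.

```python
import csv
import io

# The sample from the question, stored as UTF-8 bytes.
raw = "foo,bar,test,1,2\nthis,wont,work,because,α\n".encode("utf-8")


def read_rows(data, encoding, errors="strict"):
    """Decode data with the given encoding and error handler, then parse it as CSV."""
    text = io.TextIOWrapper(
        io.BytesIO(data), encoding=encoding, errors=errors, newline=""
    )
    return list(csv.reader(text))


# With errors='replace' nothing is raised, but each undecodable byte
# becomes U+FFFD, so the two-byte "α" turns into two replacement characters.
lossy = read_rows(raw, "ascii", errors="replace")

# With the correct encoding the character survives intact.
exact = read_rows(raw, "utf-8")
```

Here `lossy[1][4]` holds two replacement characters while `exact[1][4]` holds the original `α`, which is why specifying the right encoding is preferable whenever it is known.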

answered Sep 20 '22 15:09

Ayush Abhijeet
