Encoding Error in pandas read_csv [duplicate]

I'm attempting to read a CSV file into a DataFrame in pandas. When I do, I get the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 55: invalid start byte

This is the code that produces it:

import pandas as pd

location = r"C:\Users\khtad\Documents\test.csv"
df = pd.read_csv(location, header=0, quotechar='"')

This is on a Windows 7 Enterprise Service Pack 1 machine, and it seems to apply to every CSV file I create. In this particular case, the byte at position 55 is 00101001 and the byte at position 54 is 01110011, if that matters.

Saving the file as UTF-8 with a text editor doesn't seem to help. Adding the parameter encoding='utf-8' doesn't work either; it returns the same error.

What is the most likely cause of this error and are there any workarounds other than abandoning the DataFrame construct for the moment and using the csv module to read in the CSV line-by-line?

583

asked May 26 '15 15:05

khtad

1 Answer

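Try calling read_csv with encoding='latin1', encoding='iso-8859-1', or encoding='cp1252' (these are some of the encodings commonly found on Windows). In cp1252, byte 0x96 is an en dash, and that byte is not a valid start byte in UTF-8, which is what the error message is reporting.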
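As a rough illustration of the suggestion above, the sketch below tries a few candidate encodings against raw bytes and reports the first one that decodes without error. The helper name pick_encoding, the candidate order, and the sample bytes are assumptions made up for this example; they are not taken from the question. latin1 goes last because it accepts every byte value and so never fails (in Python, latin1 and iso-8859-1 name the same codec).

```python
def pick_encoding(data, candidates=("utf-8", "cp1252", "latin1")):
    """Return (name, text) for the first candidate encoding that decodes data.

    Hypothetical helper for illustration; the candidate list is an assumption.
    """
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            # This codec rejects at least one byte; try the next candidate.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up sample: a tiny CSV containing byte 0x96, the byte from the error message.
sample = b"id,range\n1,10\x9620\n"

encoding, text = pick_encoding(sample)
print(encoding)    # cp1252 (utf-8 rejects 0x96 as a start byte)
print(repr(text))  # the 0x96 byte comes back as an en dash, U+2013
```

If the file in the question was written by a Windows tool using that code page, the name found this way could be passed on as pd.read_csv(location, encoding='cp1252'). A successful decode only shows that the bytes are valid in that encoding, not that it is the encoding the file was written in, so it is worth checking a few rows by eye.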

139

answered Oct 02 '22 05:10

maxymoo

Tags: pandas, csv, utf-8
