
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte

import csv
import pandas as pd
db = input("Enter the dataset name:")
table = db+".csv"
df = pd.read_csv(table)
df = df.sample(frac=1).reset_index(drop=True)
with open(table,'rb') as f:
    data = csv.reader(f)
    for row in data:
        rows = row
        break
print(rows)

I am trying to read all the columns from the csv file.

ERROR: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte

harsha vardhan asked Sep 14 '25 08:09

1 Answer

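You need to check the encoding of your CSV file.

Note that print(f) only shows the encoding Python is using to open the file (the platform default unless you pass one); it does not detect the file's actual encoding: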

with open('file_name.csv') as f:
    print(f)

The output will look something like:

<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'>

If that is not the file's real encoding, open the CSV with an explicit encoding instead, for example:

with open(fname, "rt", encoding="utf8") as f:
    ...

As mentioned in the comments, your file is encoded as cp1252 (where byte 0x96 is an en dash), so open it with that encoding:

with open(fname, "rt", encoding="cp1252") as f:
    ...
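
To see why the error names byte 0x96, here is a small sketch. The sample bytes are made up for illustration and are not taken from the asker's file:

```python
# Hypothetical sample bytes containing 0x96 (not from the asker's file).
raw = b"2019 \x96 2020"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    # 0x96 is a UTF-8 continuation byte, so it cannot start a character.
    utf8_ok = False
    print(exc)

# In cp1252 the same byte maps to an en dash (U+2013).
decoded = raw.decode("cp1252")
print(decoded)
```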

For pd.read_csv, pass the same encoding:

df = pd.read_csv(table, encoding='cp1252')
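
If you are not sure of the encoding up front, one option is to try a short list of candidates in order. The following is a minimal sketch using only the standard csv module; read_header and its default candidate list are hypothetical names chosen for illustration, and the first encoding that decodes without error is not guaranteed to be the correct one:

```python
import csv


def read_header(path, encodings=("utf-8", "cp1252")):
    """Return (header_row, encoding) for the first candidate encoding
    that decodes the whole file without a UnicodeDecodeError."""
    for enc in encodings:
        try:
            # Text mode with newline="" is what csv.reader expects in Python 3.
            with open(path, "r", encoding=enc, newline="") as f:
                rows = list(csv.reader(f))  # read everything to force decoding
        except UnicodeDecodeError:
            continue  # this candidate failed, try the next one
        return (rows[0] if rows else []), enc
    raise ValueError(f"none of {encodings!r} could decode {path!r}")
```

Calling header, enc = read_header(table) gives you the column names, and enc can then be passed on, for example pd.read_csv(table, encoding=enc). Trying utf-8 first is deliberate: it rejects most non-UTF-8 input, whereas single-byte encodings such as cp1252 accept almost any byte sequence.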
shaik moeed answered Sep 16 '25 09:09
