Python read csv with Hebrew header

Question

I tried to use dataset = pandas.read_csv('filename') to create a DataFrame, but it fails because one of the column headers is written in Hebrew.

I checked, and a DataFrame can have a Hebrew word as a column header: dataset.columns = ['שלום', 'b', 'c', 'd', 'e'] works. What I want, though, is to import the data directly from the CSV file that contains the Hebrew header, and that is the part that fails.

I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf9 in position 0: invalid start byte.

How can I import the dataset into a DataFrame together with its column header?

user1875037 · Accepted Answer

I used:

import pandas as pd
dataset = pd.read_csv('file_name.csv', encoding="ISO-8859-8")

See https://docs.python.org/3/library/codecs.html#standard-encodings for the list of supported encodings.
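To illustrate the accepted answer, here is a minimal sketch that builds a small ISO-8859-8 encoded CSV in memory and reads it back with the encoding parameter. The sample data and the variable names are made up for illustration; with a real file you would pass its path instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the asker's file:
# a Hebrew header in the first column, plain ASCII elsewhere.
csv_text = "שלום,b,c\n1,2,3\n"

# Encode the text the way the problematic file is assumed to be stored.
raw = csv_text.encode("iso-8859-8")

# Tell pandas which encoding to use instead of its UTF-8 default.
dataset = pd.read_csv(io.BytesIO(raw), encoding="iso-8859-8")

print(list(dataset.columns))
```

If the file was actually saved by a Windows program, encoding="cp1255" (Windows-1255) may be the better fit; the two encodings agree on the Hebrew letters but differ on some other bytes.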

Danny_ds · Answer

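Your file is not UTF-8 encoded.

It is most likely in an 8-bit Hebrew code page (extended ASCII), such as ISO-8859-8 or Windows-1255.

In the Hebrew code page, 0xf9 is ש, the first character of the word in your header example (it is displayed last, on the right, because Hebrew is written right to left).

You'll have to use the encoding parameter with the correct code page.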
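The byte-level claim can be checked with the standard library alone. This sketch decodes the single byte 0xf9 under UTF-8 and under the two common Hebrew code pages; the variable names are illustrative only.

```python
first_byte = bytes([0xF9])

# Under UTF-8, 0xf9 cannot start a character, which is the error in the question.
utf8_error = None
try:
    first_byte.decode("utf-8")
except UnicodeDecodeError as exc:
    utf8_error = str(exc)

# Under either Hebrew code page, the same byte is the letter shin.
letter_iso = first_byte.decode("iso-8859-8")
letter_cp1255 = first_byte.decode("cp1255")

# The header word from the question starts with that byte when encoded.
header_first_byte = "שלום".encode("iso-8859-8")[0]

print(utf8_error)
print(letter_iso, letter_cp1255, hex(header_first_byte))
```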

Itamar Mushkin · Answer

As for how to check your encoding, here is a simple trick that might be of use:

You can open the file in Notepad and go to File -> Save As. Next to the Save button there is an encoding drop-down, and the file's current encoding will be selected there.
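If Notepad is not available, a rough programmatic alternative is to try a few candidate encodings in order and keep the first one that decodes without error. This is only a sketch: the function name and the candidate list are assumptions, and a successful decode shows that the bytes are valid in that encoding, not that the resulting text is correct, so the result should still be checked by eye.

```python
def first_working_encoding(raw, candidates=("utf-8", "cp1255", "iso-8859-8")):
    """Return the first candidate encoding that decodes raw without error.

    Hypothetical helper; returns None if every candidate fails.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Made-up sample bytes standing in for the start of a real file.
hebrew_codepage_bytes = "שלום,b\n".encode("cp1255")
utf8_bytes = "שלום,b\n".encode("utf-8")

guess_for_codepage = first_working_encoding(hebrew_codepage_bytes)
guess_for_utf8 = first_working_encoding(utf8_bytes)
print(guess_for_codepage, guess_for_utf8)
```

For a real file, the bytes could be obtained with open('file_name.csv', 'rb').read(), and the returned name passed to the encoding parameter of read_csv.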

Tags: python, pandas, csv, utf-8, hebrew

Matan
