
Python - UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 44: character maps to <undefined>

Using pandas in a Python 3 Jupyter notebook, I get this error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 44: character maps to <undefined>

while trying to read a JSON file that looks like this:

{
    "Test1": {
        "A": "攻撃を続ける",
        "B": "残り資源",
        "C": "残りの資源を得るため小隊を修理し戦闘を続けろ:"
    },
    "Test2": {
        "D": "{x} 日目",
        "E": "CC レベル {x}",
        "F": "本当にこれから全てのデバイスでこの基地を使用しますか?",
        "G": "この{social_network}アカウントには2つの基地が存在してます。基地の数は一人のプレイヤーにつき一つに限定されています。基地を選択するか、キャンセルしてください。"
    }
}

Any idea how to solve this?


import pandas as pd

json_df = pd.read_json('input.json')
json_df

EDIT: I have also tried reading the file with the json module, and it raises the same error.

userPyGeo asked Apr 01 '17 03:04


2 Answers

If you are reading a text file and get a UnicodeDecodeError like "'charmap' codec can't decode byte 0x81 in position ...", convert the text file to CSV and open it with an explicit UTF-8 encoding:

# Pass the encoding explicitly instead of relying on the platform default
with open('c:/.../path/.../filename.csv', encoding='utf-8') as f:
    data = f.read().lower()
Girish Kurup answered Oct 18 '22 20:10


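Your .json file is encoded as UTF-8, but pd.read_json is decoding it as CP1252 (the 'charmap' codec named in the message, typically the default on Western-locale Windows). Byte 0x81 has no character assigned in CP1252, hence the error. Tell pandas to decode the file as UTF-8: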

import pandas as pd

json_df = pd.read_json('input.json', encoding='UTF-8')
json_df
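
Since the question's edit mentions the same error with the json module, here is a minimal sketch of the same fix using only the standard library. It assumes a small sample modeled on the question's file and writes it to a temporary path (a hypothetical stand-in for input.json). It first forces CP1252 to reproduce the failure on any platform, then reads the file with an explicit UTF-8 encoding:

```python
import json
import os
import tempfile

# Sample modeled on the question's data (an assumption, not the real file).
# The character "け" is E3 81 91 in UTF-8, and byte 0x81 is undefined in CP1252.
sample = {"Test1": {"A": "攻撃を続ける", "B": "残り資源"}}

# Write the sample as UTF-8 to a temporary file standing in for input.json.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False)

# Forcing CP1252 reproduces the decode error regardless of the platform default.
try:
    with open(path, encoding="cp1252") as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print(exc)

# Opening with an explicit UTF-8 encoding lets json.load parse the file.
with open(path, encoding="utf-8") as f:
    data = json.load(f)

os.remove(path)
print(sorted(data["Test1"]))
```

The key point is the encoding argument to open(): json.load reads from the already-decoded text stream, so the encoding has to be set when the file is opened.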
pts answered Oct 18 '22 18:10