
Reading Arabic from JSON file

Tags: python, json

I want to read a JSON file in Python that contains Arabic text, but the Arabic text appears like this:

ط§ظ„ط³ظژط¹ظژط§ط¯ظژط©ظگ ظ„ظژظٹظگط³ظژطھظŒ ط§ظ„ط­ظژطµظŒظˆظژظ„ظژ ط¹ظژظ„ظ‰ظژ 
ظ…ط§ظژ ظ„ط§ظ†ظژظ…ظ„ظگظƒظژ ط¨ظژظ„ ظ‡ظگظٹظژ ط£ظ†ظژ ظ†ظژظپظ‡ظŒظ…ظژ 
ظˆظژظ†ظگط¯ط±ظژظƒظژ ظ‚ظژظٹظگظ…ط©ظڈ ظ…ظژط§ظ†ظژظ…ظ„ظƒ 

How can I read the correct Arabic letters?

import sys
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
print(x.translate(non_bmp_map))

x is a parameter that contains an Arabic value from the JSON file. I expected to get this sentence:

السَعَادَةِ لَيِسَتٌ الحَصٌوَلَ عَلىَ ماَ لانَملِكَ بَل هِيَ أنَ نَفهٌمَ وَنِدرَكَ قَيِمةُ مَانَملك

but I get:

ط§ظ„ط³ظژط¹ظژط§ط¯ظژط©ظگ ظ„ظژظٹظگط³ظژطھظŒ ط§ظ„ط­ظژطµظŒظˆظژظ„ظژ ط¹ظژظ„ظ‰ظژ ظ…ط§ظژ ظ„ط§ظ†ظژظ…ظ„ظگظƒظژ ط¨ظژظ„ ظ‡ظگظٹظژ ط£ظ†ظژ ظ†ظژظپظ‡ظŒظ…ظژ ظˆظژظ†ظگط¯ط±ظژظƒظژ ظ‚ظژظٹظگظ…ط©ظڈ ظ…ظژط§ظ†ظژظ…ظ„ظƒ

فاطمه الطاهر asked Oct 17 '22 20:10



1 Answer

You haven't mentioned whether you're using Python 3 or 2. In Python 3, strings are Unicode by default.
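For Python 3, here is a minimal sketch of reading the file with an explicit encoding, assuming the file is saved as UTF-8. The helper name load_json_utf8, the file name data.json, and the key "quote" are hypothetical and only serve the illustration. When encoding= is omitted, open() falls back to a platform-dependent default, which may not be UTF-8 and could produce garbling like the one in the question.

```python
import json
import os
import tempfile


def load_json_utf8(path):
    """Load a JSON file, decoding its bytes explicitly as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo setup (hypothetical data): write a small UTF-8 JSON file to a temp dir.
sample = {"quote": "السعادة"}
path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(path, "w", encoding="utf-8") as f:
    # ensure_ascii=False keeps the Arabic letters readable in the file
    # instead of writing \uXXXX escapes.
    json.dump(sample, f, ensure_ascii=False)

# Read it back with the same explicit encoding.
data = load_json_utf8(path)
round_trip_ok = data["quote"] == sample["quote"]
```

If the text read this way is still garbled, the file itself was probably written with a different encoding, or the damage happened before it was saved.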

If you use Python 2, use the codecs module:

import codecs
f = codecs.open('unicode.rst', encoding='utf-8')
for line in f:
    print repr(line)

Ref: Unicode HOWTO


It is possible, however, that your input data isn't correctly encoded. In that case, you can try using the ftfy package.

ftfy implements several heuristics to fix broken or inconsistent Unicode encodings. From the docs:

>>> from ftfy import fix_encoding
>>> print(fix_encoding("(à¸‡'âŒ£')à¸‡"))
(ง'⌣')ง
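
The garbled text in the question looks like UTF-8 bytes that were decoded as Windows-1256 (cp1256). That is an assumption based on the character patterns, not something the question confirms. Under that assumption, the damage can be reproduced and reversed with the standard library alone. The function names garble and repair are hypothetical:

```python
def garble(text):
    """Reproduce the mojibake: encode as UTF-8, then misread the bytes as cp1256."""
    return text.encode("utf-8").decode("cp1256")


def repair(garbled):
    """Reverse it: recover the original bytes via cp1256, then decode as UTF-8."""
    return garbled.encode("cp1256").decode("utf-8")


original = "السعادة"
broken = garble(original)   # begins with the same characters as the question's output
fixed = repair(broken)      # identical to the original string again
```

repair raises a UnicodeError if the text was not damaged in exactly this way or if characters were lost along the way, so fixing how the file is opened is the more robust solution when that is an option.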
Prashant Sinha answered Oct 20 '22 10:10
