Plotting histograms with Arabic characters

I am trying to plot a histogram of the most frequent words in an Arabic text, but I can't figure out how to do it. The axis labels come out as separate, disconnected characters instead of joined words.

Here is an example of what I get:

[image: plot output]

import seaborn as sns
import pandas as pd

res = {
 'الذكاء': 8,
 'الاصطناعي': 9,
 'هو': 2,
 'سلوك': 1,
 'وخصائص': 1,
 'معينة': 1,
 'تتسم': 1
}

df = pd.DataFrame(res.items(), columns=['word', 'count'])

sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)

As shown in the image above, the letters are drawn separately. I expect them to be joined into whole words, exactly as they are written in the dictionary keys.

saul asked May 24 '19 12:05


1 Answer

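This seems to work with arabic_reshaper and bidi (the python-bidi package), as pointed out by @Sheldore. Matplotlib, which seaborn draws with, does not appear to join Arabic letters or lay them out right to left on its own, so the labels are reshaped and reordered before plotting.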

import seaborn as sns
import pandas as pd
import arabic_reshaper
from bidi.algorithm import get_display

res = {
 'الذكاء': 8,
 'الاصطناعي': 9,
 'هو': 2,
 'سلوك': 1,
 'وخصائص': 1,
 'معينة': 1,
 'تتسم': 1
}

# Join the letters into their contextual forms, then reorder them for right-to-left display
res2 = {get_display(arabic_reshaper.reshape(k)): v for k, v in res.items()}

df = pd.DataFrame(res2.items(), columns=['word', 'count'])

sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)

[image: barplot with Arabic labels]
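To see why the reshaping step matters, here is a small standard-library sketch (not part of the original answer). It inspects one of the words above and compares it with hand-picked "presentation form" code points for the same letters. The variable names and the choice of the word "هو" are just for illustration, and the assumption is that a reshaping library substitutes code points of this kind.

```python
import unicodedata

word = "\u0647\u0648"  # "هو", one of the keys from the dictionary above

# The stored string holds abstract letters; picking the joined glyph
# for each one is normally the text renderer's job.
for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))

# Hand-picked presentation forms for the same two letters
# (initial HEH + final WAW). Assumption: a reshaping step swaps in
# code points like these so a simple renderer draws connected letters.
shaped = "\uFEEB\uFEEE"
for ch in shaped:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))

# Compatibility normalization maps the presentation forms back
# to the original letters.
restored = unicodedata.normalize("NFKC", shaped)
print(restored == word)
```

Both letters report the bidirectional class AL (Arabic letter), which is why a renderer that draws strictly left to right also needs the reordering that get_display performs in the code above.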

mozway answered Nov 11 '22 17:11