I am trying to plot a histogram of the most frequent words in an Arabic text, but I can't figure out how to do it. All I get is the individual, disconnected characters rather than the joined word.
Here is an example of what I get:
import seaborn as sns
import pandas as pd
res = {
'الذكاء': 8,
'الاصطناعي': 9,
'هو': 2,
'سلوك': 1,
'وخصائص': 1,
'معينة': 1,
'تتسم': 1
}
df = pd.DataFrame(res.items(), columns=['word', 'count'])
sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)
As shown in the image above, the letters are drawn separately. I expect them to be joined into words, exactly as they appear in the dictionary.
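This seems to work well with arabic_reshaper and bidi (the python-bidi package), as pointed out by @Sheldore. arabic_reshaper replaces each letter with its contextual (joined) form, and get_display reorders the result for right-to-left display.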
import seaborn as sns
import pandas as pd
import arabic_reshaper
from bidi.algorithm import get_display
res = {
'الذكاء': 8,
'الاصطناعي': 9,
'هو': 2,
'سلوك': 1,
'وخصائص': 1,
'معينة': 1,
'تتسم': 1
}
# Reshape each word into its joined letter forms, then reorder it for right-to-left display
res2 = {get_display(arabic_reshaper.reshape(k)): v for k, v in res.items()}
df = pd.DataFrame(res2.items(), columns=['word', 'count'])
sns.set(style="whitegrid")
ax = sns.barplot(x="count", y="word", data=df)
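If the labels mix Arabic and non-Arabic words, it may help to reshape only the ones that need it. The following is a minimal sketch using only the standard library; `contains_rtl` is a hypothetical helper name, not part of arabic_reshaper or bidi, and it assumes that checking each character's Unicode bidirectional class is a good enough test for your data.

```python
import unicodedata


def contains_rtl(text: str) -> bool:
    """Return True if any character has a right-to-left bidi class."""
    # "AL" is the class of Arabic letters, "R" covers other RTL letters (e.g. Hebrew).
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


# Hypothetical mixed labels: two Arabic words and one Latin word.
labels = ["الذكاء", "count", "هو"]
needs_shaping = [contains_rtl(label) for label in labels]
```

A check like this could guard the `get_display(arabic_reshaper.reshape(k))` call in the dictionary comprehension, so that plain Latin labels are passed through unchanged.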