 

How do I input Arabic text into my Python code?

My project is to classify the sentiment of Arabic text as positive or negative (sentiment analysis). I am using NLTK and Python, but when I enter tweets in Arabic, an error occurs:

>>> pos_tweets = [(' أساند كل عون أمن شريف', 'positive'),
              ('ما أحلى الثورة التونسية', 'positive'),
              ('أجمل طفل في العالم', 'positive'),
              ('الشعب يحرس', 'positive'),
              ('ثورة شعبنا هي ثورة الكـــرامة وثـــورة الأحــــرار', 'positive')]
Unsupported characters in input

How can I solve this problem?

Manel Ayadi asked Mar 04 '13 07:03



2 Answers

The problem comes from the IDLE shell. As far as I know, IDLE does not accept UTF-8 input in interactive mode.

I suggest you use an alternative (and better) shell such as DreamPie or PythonWin.
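
If switching shells is not an option, one possible workaround is to keep the tweets in a UTF-8 text file and load them from Python instead of typing them at the prompt. The following is only a minimal sketch, assuming Python 3; the helper name `load_labeled_tweets` and the temporary file are hypothetical and exist purely for illustration:

```python
import os
import tempfile


def load_labeled_tweets(path, label):
    """Read one tweet per line from a UTF-8 file and pair each with `label`."""
    with open(path, encoding="utf-8") as handle:
        return [(line.strip(), label) for line in handle if line.strip()]


# Demo only: create a small UTF-8 file with two sample tweets.
# In practice the file would already exist, written with any UTF-8 editor.
sample = "ما أحلى الثورة التونسية\nأجمل طفل في العالم\n"
fd, tweets_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(tweets_path, "w", encoding="utf-8") as handle:
    handle.write(sample)

pos_tweets = load_labeled_tweets(tweets_path, "positive")
os.remove(tweets_path)

# Report how many tweets were loaded without depending on console encoding.
print(len(pos_tweets), "tweets loaded")
```

Passing `encoding="utf-8"` explicitly means the result does not depend on the default encoding of the shell or the platform.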

pram answered Sep 30 '22 00:09



There is a simple hack that I usually use to get UTF-8 text into my Python code. I don't know why it works, but Python accepts the Unicode strings and runs the script smoothly after I add these lines at the top of the file:

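#!/usr/local/bin/python
# -*- coding: UTF-8 -*-
# Keep the shebang and the encoding declaration on separate lines;
# the declaration tells Python 2 to decode this source file as UTF-8.

pos_tweets = [(u' أساند كل عون أمن شريف', 'positive'),
              (u'ما أحلى الثورة التونسية', 'positive'),
              (u'أجمل طفل في العالم', 'positive'),
              (u'الشعب يحرس', 'positive'),
              (u'ثورة شعبنا هي ثورة الكـــرامة وثـــورة الأحــــرار', 'positive')]

# Print each tweet followed by its label (Python 2 print statement).
for text, label in pos_tweets:
    print text, label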
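
The script above targets Python 2. For comparison, here is a rough sketch of the same script under Python 3, where source files are decoded as UTF-8 by default and every `str` is already Unicode, so neither the coding line nor the `u` prefix should be needed. It assumes the file is saved as UTF-8 and that the terminal can display Arabic:

```python
# Python 3 sketch: no encoding declaration and no u'' prefix required,
# provided this file is saved as UTF-8.
pos_tweets = [
    (' أساند كل عون أمن شريف', 'positive'),
    ('ما أحلى الثورة التونسية', 'positive'),
    ('أجمل طفل في العالم', 'positive'),
    ('الشعب يحرس', 'positive'),
    ('ثورة شعبنا هي ثورة الكـــرامة وثـــورة الأحــــرار', 'positive'),
]

# Print each tweet followed by its label.
for text, label in pos_tweets:
    print(text, label)
```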
alvas answered Sep 30 '22 02:09
