I am trying to clean up tweets to analyze their sentiment. I want to turn emojis into what they mean.
For instance, I want my code to convert
'I ❤ New York'
'Python is 👍'
to
'I love New York'
'Python is cool'
I have seen packages such as emoji, but they turn emojis into what they represent, not what they mean. For instance, they turn my tweets into:
print(emoji.demojize('Python is 👍'))
'Python is :thumbs_up:'
print(emoji.demojize('I ❤ New York'))
'I :heart: New York'
Since "heart" and "thumbs_up" do not carry a positive or negative meaning in TextBlob, this kind of conversion is useless. But if "❤" is converted to "love", the results of sentiment analysis will improve drastically.
Referring to this Kaggle kernel here:
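import re

# UNICODE_EMO is the emoji-to-description dictionary defined in the kernel
def convert_emojis(text):
    for emot in UNICODE_EMO:
        # Strip commas and colons from the description, join its words with "_"
        replacement = "_".join(UNICODE_EMO[emot].replace(",", "").replace(":", "").split())
        # re.escape guards against emojis containing regex metacharacters
        text = re.sub(re.escape(emot), replacement, text)
    return text
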
text = "game is on 🔥"
convert_emojis(text)
This gives the output 'game is on fire'. You can find a dictionary mapping from emojis to words here.
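If the descriptions in that dictionary are still too literal for sentiment analysis (for example "thumbs_up" instead of "cool"), one option is to keep a small hand-written mapping from emojis to the words you want. The sketch below is only an illustration: EMOJI_MEANINGS and emojis_to_meanings are hypothetical names, and the word chosen for each emoji is my own assumption, not something taken from a library.

```python
import re

# Hypothetical hand-written mapping; the words are illustrative choices.
EMOJI_MEANINGS = {
    "\u2764": "love",      # ❤
    "\U0001F44D": "cool",  # 👍
    "\U0001F525": "fire",  # 🔥
}

# One alternation over all keys, longest first so longer sequences match first.
_EMOJI_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOJI_MEANINGS, key=len, reverse=True))
)

def emojis_to_meanings(text):
    # Drop the variation selector (U+FE0F) that often follows emojis such as ❤.
    text = text.replace("\ufe0f", "")
    # Replace every known emoji with its mapped word; unknown emojis are left as-is.
    return _EMOJI_PATTERN.sub(lambda m: EMOJI_MEANINGS[m.group(0)], text)

print(emojis_to_meanings("I \u2764 New York"))       # I love New York
print(emojis_to_meanings("Python is \U0001F44D"))    # Python is cool
```

You would need to extend the mapping by hand for the emojis that actually show up in your tweets, so this works best when a small set of emojis covers most of your data.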
Hope this helps