
Unable to add new words in Vader Lexicon using for loop. It works without the loop perfectly. How do I solve this?

Tags:

python

nltk

vader

I use VADER for sentiment analysis. When I add a single word to the VADER lexicon, it works, i.e. it detects the newly added word as positive or negative based on the value I give it. The code is below:

import nltk  # needed for nltk.word_tokenize below
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sid_obj = SentimentIntensityAnalyzer() 
new_word = {'counterfeit':-2,'Good':2,}
sid_obj.lexicon.update(new_word)
sentence = "Company Caught Counterfeit." 
sentiment_dict = sid_obj.polarity_scores(sentence) 
tokenized_sentence = nltk.word_tokenize(sentence)
pos_word_list=[]
neu_word_list=[]
neg_word_list=[]

for word in tokenized_sentence:
    if (sid_obj.polarity_scores(word)['compound']) >= 0.1:
        pos_word_list.append(word)
    elif (sid_obj.polarity_scores(word)['compound']) <= -0.1:
        neg_word_list.append(word)
    else:
        neu_word_list.append(word)                

print('Positive:',pos_word_list)
print('Neutral:',neu_word_list)
print('Negative:',neg_word_list) 

print("Overall sentiment dictionary is : ", sentiment_dict) 
print("sentence was rated as ", sentiment_dict['neg']*100, "% Negative") 
print("sentence was rated as ", sentiment_dict['neu']*100, "% Neutral") 
print("sentence was rated as ", sentiment_dict['pos']*100, "% Positive") 

print("Sentence Overall Rated As", end = " ") 

# decide sentiment as positive, negative and neutral 
if sentiment_dict['compound'] >= 0.05 : 
    print("Positive") 

elif sentiment_dict['compound'] <= - 0.05 : 
    print("Negative") 

else : 
    print("Neutral") 

The output is as follows:

Positive: []
Neutral: ['Company', 'Caught', '.']
Negative: ['Counterfeit']
Overall sentiment dictionary is :  {'neg': 0.6, 'neu': 0.4, 'pos': 0.0, 'compound': -0.4588}
sentence was rated as  60.0 % Negative
sentence was rated as  40.0 % Neutral
sentence was rated as  0.0 % Positive
Sentence Overall Rated As Negative

It works perfectly for one word added to the lexicon. When I try to do the same for multiple words loaded from a CSV file, using the code below, the word Counterfeit does not get added to my VADER lexicon.

import csv

new_word = {}
with open('Dictionary.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        new_word[row['Word']] = int(row['Value'])
print(new_word)
sid_obj.lexicon.update(new_word)

The code above prints the dictionary that is then added to the lexicon. The dictionary looks like this (it has about 2000 words, but I've only printed a few; it also contains Counterfeit as a word):

{'CYBERATTACK': -2, 'CYBERATTACKS': -2, 'CYBERBULLYING': -2, 'CYBERCRIME': 
-2, 'CYBERCRIMES': -2, 'CYBERCRIMINAL': -2, 'CYBERCRIMINALS': -2, 
'MISCHARACTERIZATION': -2, 'MISCLASSIFICATIONS': -2, 'MISCLASSIFY': -2, 
'MISCOMMUNICATION': -2, 'MISPRICE': -2, 'MISPRICING': -2, 'STRICTLY': -2}

The output is as follows:

Positive: []
Neutral: ['Company', 'Caught', 'Counterfeit', '.']
Negative: []
Overall sentiment dictionary is :  {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
sentence was rated as  0.0 % Negative
sentence was rated as  100.0 % Neutral
sentence was rated as  0.0 % Positive
Sentence Overall Rated As Neutral

Where am I going wrong when adding multiple words to the lexicon? The CSV file consists of two columns: one with the word and the other with its value as a negative or positive number. Why is the word still identified as neutral? Any help will be appreciated. Thank you.

Rathan M asked Jan 23 '26 12:01


1 Answer

Solved it, thanks. The issue was that the words in my dictionary were in upper case. Lexicon keys must be stored in lower case, because VADER converts each word to lowercase before looking it up in the lexicon.
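For anyone hitting the same thing, here is a minimal sketch of the fix, assuming the same two-column layout (Word, Value) as my Dictionary.csv. The helper name load_lexicon_updates and the in-memory sample rows are just for illustration, and I convert values with float() instead of int() so fractional scores would also load:

```python
import csv
import io


def load_lexicon_updates(csvfile):
    """Read Word/Value rows and return a dict keyed by lowercased words.

    Hypothetical helper for illustration; not part of vaderSentiment.
    """
    reader = csv.DictReader(csvfile)
    updates = {}
    for row in reader:
        # VADER lowercases each word before the lexicon lookup,
        # so an upper-case key can never match.
        updates[row['Word'].strip().lower()] = float(row['Value'])
    return updates


# In-memory stand-in for Dictionary.csv (sample rows are illustrative only).
sample = io.StringIO("Word,Value\nCOUNTERFEIT,-2\nCYBERATTACK,-2\n")
new_word = load_lexicon_updates(sample)
print(new_word)
```

With a real file, pass the handle from open('Dictionary.csv', newline='') to the helper and then call sid_obj.lexicon.update(new_word) as in the question.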

Rathan M answered Jan 26 '26 01:01