I would like to add words to vader_lexicon.txt and assign polarity scores to them. What is the right way to do so?
I found the file in AppData\Roaming\nltk_data\sentiment\vader_lexicon. Each line consists of the word, its mean polarity score, the standard deviation, and an array of 10 intensity scores given by "10 independent human raters". [1] However, when I edited it, nothing changed in the results of the following code:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
s = sia.polarity_scores("my string here")
I think that this text file is read by my code when I call SentimentIntensityAnalyzer's constructor. [2] Do you have any ideas on how I can edit a pre-made lexicon?
Sources:
[1] https://github.com/cjhutto/vaderSentiment
[2] http://www.nltk.org/api/nltk.sentiment.html
accuracy (with classification thresholds set at -0.05 and +0.05 for all normalized sentiment scores between -1 and 1), we can see that VADER (F1 = 0.96) actually outperforms even individual human raters (F1 = 0.84) at correctly classifying the sentiment of tweets.
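To make the thresholds concrete, here is a minimal sketch of how a normalized score could be turned into a label. The function name classify_compound is my own, and treating a score exactly at the threshold as positive or negative (rather than neutral) is an assumption based on the usual VADER convention.

```python
def classify_compound(compound, threshold=0.05):
    """Map a normalized score in [-1, 1] to a sentiment label.

    Assumption: scores exactly at +/- threshold count as
    positive/negative rather than neutral.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


for score in (0.6, 0.0, -0.3):
    print(score, classify_compound(score))
```

With NLTK, the value to pass in would be the 'compound' entry of the dictionary returned by polarity_scores.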
There are over 7500 tokens listed in the VADER lexicon. (You can also add your own if you like.) VADER also applies grammatical and syntactical rules to measure intensity based on word order and sensitive relationships between terms.
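If you do add your own tokens to the file, it helps to know what one entry looks like. As far as I can tell, each line is tab-separated: the token, its mean rating, the standard deviation, and the list of raw ratings. The sketch below parses such a line using only the standard library; parse_lexicon_line is a hypothetical helper and the 'foo' entry is made up for illustration.

```python
import ast


def parse_lexicon_line(line):
    """Split one tab-separated lexicon line into its four fields."""
    token, mean, std, raw = line.rstrip("\n").split("\t")
    # The raw ratings are written as a Python-style list literal.
    return token, float(mean), float(std), ast.literal_eval(raw)


# Made-up entry in the assumed file format (not a real lexicon line).
sample = "foo\t2.0\t0.4472\t[2, 2, 1, 3, 2, 2, 2, 2, 2, 2]\n"
token, mean, std, ratings = parse_lexicon_line(sample)
print(token, mean, std, len(ratings))
```

My understanding is that NLTK itself only uses the first two fields (token and mean rating) when it builds its dictionary, so the mean is the number that actually affects the scores.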
VADER uses a sentiment lexicon: a list of lexical features (e.g., words) which are generally labeled according to their semantic orientation as either positive or negative. VADER not only reports whether a sentiment is positive or negative but also tells us how positive or negative it is.
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a model used for text sentiment analysis that is sensitive to both polarity (positive/negative) and intensity (strength) of emotion. It is available in the NLTK package and can be applied directly to unlabeled text data.
For anyone interested, this can also be achieved without manually editing the VADER lexicon .txt file. Once loaded, the lexicon is a normal dictionary with words as keys and scores as values. As provided by repoleved in this post:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
new_words = {
'foo': 2.0,
'bar': -3.4,
}
SIA = SentimentIntensityAnalyzer()
SIA.lexicon.update(new_words)
If you wish to remove words, use the dictionary's 'pop' method:
SIA = SentimentIntensityAnalyzer()
SIA.lexicon.pop('no')
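One caveat with the calls above: pop raises a KeyError if the word is not in the lexicon, and I believe VADER lowercases each token before looking it up, so mixed-case keys would never match. A small helper along these lines handles both. This is only a sketch: apply_lexicon_overrides is a hypothetical name, and the plain dict with made-up scores stands in for SIA.lexicon so the example runs without the NLTK data.

```python
def apply_lexicon_overrides(lexicon, additions=None, removals=()):
    """Update a lexicon dict in place: add or overwrite scores, drop words."""
    for word, score in (additions or {}).items():
        # Assumption: lookups are done on lowercased tokens.
        lexicon[word.lower()] = float(score)
    for word in removals:
        # The default argument avoids a KeyError for absent words.
        lexicon.pop(word.lower(), None)
    return lexicon


# Stand-in for SIA.lexicon; the scores here are made up.
lexicon = {"good": 1.9, "no": -1.2}
apply_lexicon_overrides(
    lexicon,
    additions={"Foo": 2.0, "bar": -3.4},
    removals=["no", "missing"],
)
print(lexicon)  # {'good': 1.9, 'foo': 2.0, 'bar': -3.4}
```

With a real analyzer you would pass SIA.lexicon as the first argument. The changes only live in that one object, so they have to be reapplied each time a new SentimentIntensityAnalyzer is created.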
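I found the fix. NLTK reads the lexicon from the vader_lexicon.zip archive rather than from the unpacked folder, so I re-zipped the vader_lexicon folder that contains the txt file, and the changes I applied are now the ones being used.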
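If you would rather script the zipping step than do it by hand, something like the following should work. It is a sketch under the assumption that the edited folder and the archive sit side by side in the sentiment directory and that the archive should contain vader_lexicon/vader_lexicon.txt; rezip_lexicon is a hypothetical helper, not part of NLTK.

```python
import zipfile
from pathlib import Path


def rezip_lexicon(sentiment_dir):
    """Rebuild vader_lexicon.zip from the edited vader_lexicon folder."""
    sentiment_dir = Path(sentiment_dir)
    folder = sentiment_dir / "vader_lexicon"
    archive = sentiment_dir / "vader_lexicon.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to sentiment_dir so the archive
                # contains vader_lexicon/vader_lexicon.txt.
                zf.write(path, path.relative_to(sentiment_dir).as_posix())
    return archive


# Example call (path taken from the question; adjust to your machine):
# rezip_lexicon(Path.home() / "AppData" / "Roaming" / "nltk_data" / "sentiment")
```

This overwrites the existing archive, so it is worth keeping a copy of the original vader_lexicon.zip before running it.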