I've been reading a lot of articles that explain the need for an initial set of texts that are classified as either 'positive' or 'negative' before a sentiment analysis system will really work.
My question is: Has anyone attempted just doing a rudimentary check of 'positive' adjectives vs 'negative' adjectives, taking into account any simple negators to avoid classing 'not happy' as positive? If so, are there any articles that discuss just why this strategy isn't realistic?
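To make the question concrete, here is a minimal sketch of that rudimentary strategy: count positive versus negative adjectives and flip polarity when the previous token is a negator. The tiny word lists are invented placeholders, not a real lexicon, and the one-token negation window is a simplifying assumption:

```python
# Minimal lexicon-counting sentiment check. The word sets below are
# illustrative placeholders only, not a real sentiment lexicon.
POSITIVE = {"happy", "great", "excellent", "good"}
NEGATIVE = {"sad", "terrible", "poor", "bad"}
NEGATORS = {"not", "never", "no"}

def simple_sentiment(text: str) -> str:
    tokens = text.lower().split()
    score = 0
    for i, tok in enumerate(tokens):
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        # Flip polarity if the immediately preceding token is a negator,
        # so 'not happy' counts as negative rather than positive.
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(simple_sentiment("not happy with this terrible service"))  # negative
print(simple_sentiment("a great and excellent film"))            # positive
```

The obvious weaknesses of this sketch (no handling of sarcasm, intensifiers, longer-range negation, or domain-specific vocabulary) are exactly the gaps the literature discusses.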
Usually, such a model is given an artificial supervised task, such as predicting a word from the words that surround it, or predicting the surrounding words from a given word (see word2vec), or predicting the next word/sentence from the previous words/sentences (transformer models).
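As a concrete sketch of one such artificial task, here is how skip-gram-style training pairs (predict surrounding words from a given word) can be generated from raw text; the whitespace tokenizer and fixed window size are simplifying assumptions:

```python
# Generate (center, context) training pairs for a skip-gram-style task:
# each word is paired with every neighbour inside a fixed window.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()
print(skipgram_pairs(tokens, window=1))
```

These pairs are the "labels" the model trains on, which is why no human annotation is needed even though the training loop looks supervised.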
Sentiment analysis can be performed with one of two machine learning approaches: supervised or unsupervised. As is well known, sentiments are typically classed as either positive or negative.
Most of the approaches I have found for sentiment analysis are supervised (they need labeled data to train a classifier).
VADER is an unsupervised, lexicon- and rule-based algorithm widely used in sentiment analysis. No training, no classifier, no pickling: it just works out of the box.
A classic paper by Peter Turney (2002) explains a method for unsupervised sentiment analysis (positive/negative classification) using only the words "excellent" and "poor" as a seed set. Turney uses the pointwise mutual information (PMI) of other words with these two adjectives to achieve an accuracy of 74%.
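To illustrate the idea behind Turney's approach, this toy sketch scores a word by PMI(word, "excellent") minus PMI(word, "poor"), estimating probabilities from co-occurrence within the same document. The four-line corpus is invented for illustration; Turney actually estimated counts from web search hits using a NEAR operator:

```python
import math

# Toy corpus, invented for illustration only.
corpus = [
    "excellent food and wonderful service",
    "wonderful excellent experience overall",
    "poor food and awful service",
    "awful poor experience overall",
]
docs = [set(doc.split()) for doc in corpus]
n = len(docs)

def p(*words):
    # Fraction of documents containing all the given words.
    return sum(all(w in d for w in words) for d in docs) / n

def pmi(x, y):
    joint = p(x, y)
    return math.log2(joint / (p(x) * p(y))) if joint else float("-inf")

def semantic_orientation(word):
    # Positive when the word co-occurs more with 'excellent' than 'poor'.
    return pmi(word, "excellent") - pmi(word, "poor")

print(semantic_orientation("wonderful") > 0)  # True
print(semantic_orientation("awful") < 0)      # True
```

With a large enough corpus, the same score separates most evaluative words without any labeled training data, which is what makes the seed-set trick work.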