Best Algorithmic Approach to Sentiment Analysis [closed]

Tags: nlp, sentiment-analysis

I need to take in news articles and determine whether they are positive or negative about a subject. I am taking the approach outlined below, but I keep reading that NLP may be of use here. Everything I have read points to NLP being used to separate opinion from fact, which I don't think would matter much in my case. I'm wondering two things:

1) Why wouldn't my algorithm work, and/or how can I improve it? (I know sarcasm would probably be a pitfall, but again I don't see that occurring much in the type of news we will be getting.)

2) How would NLP help, and why should I use it?

My algorithmic approach (I have dictionaries of positive, negative, and negation words):

1) Count number of positive and negative words in article

2) If a negation word is found within 2 or 3 words of the positive or negative word (e.g. NOT the best), negate the score.

3) Multiply the scores by weights that have been manually assigned to each word. (1.0 to start)

4) Add up the totals for positive and negative to get the sentiment score.
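The four steps above could be sketched roughly as follows. This is only a minimal illustration under some assumptions: the tiny dictionaries, the `WINDOW` size, and the function name `sentiment_score` are made up for the example, "negate the score" is read as flipping the word's polarity, and only the words *before* a sentiment word are checked for negations.

```python
import re

# Tiny illustrative dictionaries (hypothetical); real ones would be much larger.
# Each word maps to its manually assigned weight, 1.0 to start (step 3).
POSITIVE = {"good": 1.0, "best": 1.0, "great": 1.0}
NEGATIVE = {"bad": 1.0, "terrible": 1.0, "worst": 1.0}
NEGATIONS = {"not", "never", "no"}
WINDOW = 3  # assumed: how many preceding words to scan for a negation


def sentiment_score(text):
    """Return (positive_total, negative_total, net_score) for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    pos_total = 0.0
    neg_total = 0.0
    for i, word in enumerate(words):
        # Step 1: find positive and negative words.
        if word in POSITIVE:
            weight, is_positive = POSITIVE[word], True
        elif word in NEGATIVE:
            weight, is_positive = NEGATIVE[word], False
        else:
            continue
        # Step 2: a negation in the preceding WINDOW words flips the polarity.
        if any(w in NEGATIONS for w in words[max(0, i - WINDOW):i]):
            is_positive = not is_positive
        # Steps 3 and 4: add the weighted score to the matching total.
        if is_positive:
            pos_total += weight
        else:
            neg_total += weight
    return pos_total, neg_total, pos_total - neg_total
```

For example, `sentiment_score("This is not the best plan")` counts "best" as negative because "not" appears two words before it.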

asked Nov 16 '10 21:11

user387049

1 Answer

I don't think there's anything particularly wrong with your algorithm. It's a fairly straightforward and practical way to go, but there are a lot of situations where it will make mistakes:

1) Ambiguous sentiment words - "This product works terribly" vs. "This product is terribly good"
2) Missed negations - "I would never in a million years say that this product is worth buying"
3) Quoted/Indirect text - "My dad says this product is terrible, but I disagree"
4) Comparisons - "This product is about as useful as a hole in the head"
5) Anything subtle - "This product is ugly, slow and uninspiring, but it's the only thing on the market that does the job"
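To make a few of these failure cases concrete, here is a small sketch of a word-counting scorer run on some of the sentences above. The lexicon, the three-word negation window, and the name `naive_score` are assumptions chosen for illustration, not the asker's actual dictionaries.

```python
import re

# Hypothetical mini-lexicon: positive words score +1.0, negative words -1.0.
LEXICON = {"good": 1.0, "worth": 1.0, "terribly": -1.0, "terrible": -1.0}
NEGATIONS = {"not", "never", "no"}
WINDOW = 3  # assumed negation window, in words before the sentiment word


def naive_score(text):
    """Sum lexicon weights, flipping the sign after a nearby negation."""
    words = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, word in enumerate(words):
        weight = LEXICON.get(word, 0.0)
        if any(w in NEGATIONS for w in words[max(0, i - WINDOW):i]):
            weight = -weight
        total += weight
    return total


examples = [
    "This product works terribly",
    "This product is terribly good",
    "I would never in a million years say that this product is worth buying",
    "My dad says this product is terrible, but I disagree",
]
for sentence in examples:
    print(naive_score(sentence), sentence)
```

With these assumptions, "terribly good" cancels out to 0.0 even though it is positive, the "never ... worth buying" sentence scores +1.0 because the negation is far outside the window, and the quoted opinion scores -1.0 although the writer disagrees with it.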

I'm using product reviews for examples instead of news stories, but you get the idea. In fact, news articles are probably harder because they will often try to show both sides of an argument and tend to use a certain style to convey a point. The final example, for instance, is quite common in opinion pieces.

As far as NLP helping you with any of this: word sense disambiguation (or even just part-of-speech tagging) may help with (1), syntactic parsing might help with the long-range dependencies in (2), and some kind of chunking might help with (3). It's all research-level work though; there's nothing that I know of that you can use directly. Issues (4) and (5) are a lot harder; I throw up my hands and give up at this point.
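As a rough illustration of how part-of-speech tags might help with the "terribly good" case, the sketch below assumes the text has already been tagged by some tagger (the Penn Treebank style tags are supplied by hand here). The rule that an adverb directly before an adjective acts as an intensifier, the `INTENSIFIER_BOOST` value, and the name `score_tagged` are all assumptions for the example, not an established method.

```python
# Hypothetical mini-lexicon and settings, for illustration only.
LEXICON = {"good": 1.0, "terribly": -1.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIER_BOOST = 1.5  # assumed multiplier for an intensified adjective


def score_tagged(tagged_words):
    """Score a list of (word, tag) pairs, e.g. [("good", "JJ")].

    An adverb (RB*) immediately followed by an adjective (JJ*) is treated
    as an intensifier: it is not scored itself and boosts the adjective.
    Negations are excluded from this rule and would need their own handling.
    """
    total = 0.0
    boost = 1.0
    for i, (word, tag) in enumerate(tagged_words):
        word = word.lower()
        next_tag = tagged_words[i + 1][1] if i + 1 < len(tagged_words) else None
        is_intensifier = (
            tag.startswith("RB")
            and word not in NEGATIONS
            and next_tag is not None
            and next_tag.startswith("JJ")
        )
        if is_intensifier:
            boost = INTENSIFIER_BOOST
            continue
        total += LEXICON.get(word, 0.0) * boost
        boost = 1.0
    return total


works_terribly = [("This", "DT"), ("product", "NN"), ("works", "VBZ"), ("terribly", "RB")]
terribly_good = [("This", "DT"), ("product", "NN"), ("is", "VBZ"), ("terribly", "RB"), ("good", "JJ")]
print(score_tagged(works_terribly), score_tagged(terribly_good))
```

Under these assumptions "works terribly" stays negative while "terribly good" comes out positive, but the rule is crude and would need checking against real tagger output.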

I'd stick with the approach you have and look at the output carefully to see if it is doing what you want. Of course, that then raises the issue of what you understand the definition of "sentiment" to be in the first place...

answered Sep 23 '22 14:09

Stompchicken

