
Good algorithm for sentiment analysis

I tried a naive Bayes classifier and it works very badly. SVM works a little better, but still horribly. Most of the papers I have read are about SVM and naive Bayes with some variations (n-grams, POS tags, etc.), but all of them give me results close to 50% (the authors report 80% and higher, but I cannot get the same accuracy on real data).

Are there any more powerful methods besides lexical analysis? SVM and naive Bayes assume that words are independent. This approach is called "bag of words". What if we assume that words are associated?

For example: use the Apriori algorithm to detect that if a sentence contains "bad" and "horrible", then there is a 70% probability that the sentence is negative. We could also use the distance between words, and so on.
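To make the idea concrete, here is a minimal sketch of the counting step behind such a rule. It is not full Apriori (there is no level-wise candidate pruning); it only counts word pairs and keeps those whose confidence for the negative class passes a threshold. The function name `negative_rules`, the thresholds, and the toy sentences are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def negative_rules(labeled, min_support=2, min_confidence=0.7):
    """Return {word_pair: confidence} for rules 'pair in sentence => negative'."""
    pair_total = Counter()     # sentences containing the pair
    pair_negative = Counter()  # negative sentences containing the pair
    for text, label in labeled:
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            pair_total[pair] += 1
            if label == "neg":
                pair_negative[pair] += 1

    rules = {}
    for pair, total in pair_total.items():
        confidence = pair_negative[pair] / total
        if total >= min_support and confidence >= min_confidence:
            rules[pair] = confidence
    return rules


# Toy training data (hypothetical).
labeled = [
    ("bad and horrible service", "neg"),
    ("horrible food bad staff", "neg"),
    ("bad luck but good food", "pos"),
    ("good staff good service", "pos"),
]

rules = negative_rules(labeled)
print(rules)
```

On this toy data only the pair ("bad", "horrible") survives: it occurs in two sentences and both are negative, while ("bad", "food") occurs twice but is negative only half the time.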

Is this a good idea, or am I reinventing the wheel?

Neir0 asked Jun 11 '12 14:06





2 Answers

You're confusing a couple of concepts here. Neither Naive Bayes nor SVMs are tied to the bag of words approach. Neither SVMs nor the BOW approach have an independence assumption between terms.

Here are some things you can try:

  • include punctuation marks in your bags of words; esp. ! and ? can be helpful for sentiment analysis, while many feature extractors geared toward document classification throw them away
  • same for stop words: words like "I" and "my" may be indicative of subjective text
  • build a two-stage classifier; first determine whether any opinion is expressed, then whether it's positive or negative
  • try a quadratic kernel SVM instead of a linear one to capture interactions between features.
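As a rough illustration of the first, second, and last suggestions, the sketch below builds bags of words that keep "!" and "?" as tokens and do not filter stop words, then evaluates a quadratic (degree-2 polynomial) kernel on two of them. Expanding (x·y + 1)^2 produces products of feature pairs, which is how such a kernel can capture interactions between words. The regular expression and the function names are assumptions made for this example; in scikit-learn the corresponding classifier would be `SVC(kernel="poly", degree=2)`.

```python
import re
from collections import Counter

# Words (with apostrophes) plus the punctuation marks "!" and "?" as tokens.
TOKEN_RE = re.compile(r"[a-z']+|[!?]")


def bag_of_words(text):
    """Count tokens; stop words such as "I" and "my" are deliberately kept."""
    return Counter(TOKEN_RE.findall(text.lower()))


def quadratic_kernel(a, b, coef0=1.0):
    """Degree-2 polynomial kernel (a . b + coef0) ** 2 on sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    return (dot + coef0) ** 2


x = bag_of_words("I love my phone!")
y = bag_of_words("Why is my phone so slow?")

print(x)
print(quadratic_kernel(x, y))
```

The two sentences share the tokens "my" and "phone", so the dot product is 2 and the kernel value is (2 + 1)^2 = 9.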
Fred Foo answered Nov 05 '22 01:11



Algorithms like SVM, naive Bayes, and maximum entropy are supervised machine learning algorithms, so the output of your program depends on the training set you provide. For large-scale sentiment analysis I prefer an unsupervised learning method, in which one determines the sentiment of adjectives by clustering documents into same-oriented parts and labeling the clusters positive or negative. More information can be found in this paper: http://icwsm.org/papers/3--Godbole-Srinivasaiah-Skiena.pdf
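A very small sketch of the clustering idea, under heavy assumptions: documents are split into two groups with a plain 2-means loop over word counts (cosine similarity), and each cluster is then labeled using a handful of seed words. This is not the method of the linked paper, only an illustration of "cluster, then label"; the seed lists, the toy documents, and every function name here are hypothetical.

```python
import math
from collections import Counter

POSITIVE_SEEDS = {"great", "love"}  # hypothetical seed words
NEGATIVE_SEEDS = {"awful", "hate"}


def vectorize(text):
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def two_means(vectors, iterations=10):
    """Cluster into two groups; start from the first document and the one least like it."""
    centroids = [vectors[0], min(vectors, key=lambda v: cosine(vectors[0], v))]
    assignment = []
    for _ in range(iterations):
        new_assignment = [
            0 if cosine(v, centroids[0]) >= cosine(v, centroids[1]) else 1
            for v in vectors
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for k in (0, 1):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:
                # Summed counts point in the same direction as the mean vector.
                centroids[k] = sum(members, Counter())
    return assignment, centroids


def label_cluster(centroid):
    score = sum(centroid[w] for w in POSITIVE_SEEDS) - sum(
        centroid[w] for w in NEGATIVE_SEEDS
    )
    return "positive" if score >= 0 else "negative"


documents = [
    "great phone great battery",
    "awful screen awful service",
    "great battery love it",
    "awful service hate it",
]

assignment, centroids = two_means([vectorize(d) for d in documents])
labels = [label_cluster(c) for c in centroids]
print(assignment, labels)
```

On real data the clusters will rarely separate this cleanly, and the choice of seed words matters a lot.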

Hope this helps you in your work :)

Aravind Asok answered Nov 05 '22 00:11
