How is sentiment analysis computed in TextBlob?

I use the following to compute the sentiment of 200 short sentences (I did not use a training data set):

    from textblob import TextBlob

    blob = TextBlob(text)  # text holds the 200 sentences
    for sentence in blob.sentences:
        print(sentence.sentiment)

The analysis returns two values: polarity and subjectivity. From what I read online, the polarity score is a float within the range [-1.0, 1.0], where 0 indicates neutral, +1 a very positive attitude, and -1 a very negative attitude. Subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective.
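For reference, here is a tiny helper that buckets a polarity score into a coarse label under the interpretation above. The thresholds are my own illustration; TextBlob itself only returns the raw float:

    def label_polarity(polarity, eps=0.05):
        """Bucket a polarity in [-1.0, 1.0] into a coarse label.

        The +/-eps dead zone around 0 is an arbitrary choice for
        illustration, not part of TextBlob.
        """
        if polarity > eps:
            return "positive"
        if polarity < -eps:
            return "negative"
        return "neutral"

    print(label_polarity(0.8))   # positive
    print(label_polarity(-0.6))  # negative
    print(label_polarity(0.0))   # neutral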

So, now my question: How are those scores computed?

Almost half of the phrases get a polarity score of zero, and I am wondering whether zero indicates neutrality or rather that the phrase does not contain any words that carry a polarity. I have the same question about the other sentiment analyzer, NaiveBayesAnalyzer.

Thank you for your help!
Marie

asked Dec 29 '15 by MarieJ





2 Answers

According to TextBlob's creator, Steven Loria, TextBlob's sentiment analyzer delegates to pattern.en's sentiment module. pattern.en itself uses a dictionary-based approach with a few heuristics to handle, e.g., negation. You can find the source here; it is a vendorized version of pattern.en's text module, with minor tweaks for Python 3 compatibility.
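As a rough sketch of what a dictionary-based approach with a negation heuristic looks like (the mini-lexicon and the -0.5 negation flip below are made-up illustrations, not pattern.en's actual data or rules):

    # Hypothetical mini-lexicon: word -> polarity in [-1.0, 1.0].
    LEXICON = {"great": 0.8, "good": 0.7, "bad": -0.7, "terrible": -1.0}
    NEGATIONS = {"not", "never", "no"}

    def polarity(sentence):
        """Average the polarities of known words; dampen and flip a word
        that follows a negation. Words absent from the lexicon are
        skipped, so a sentence with no known words scores exactly 0.0."""
        words = sentence.lower().split()
        scores = []
        for i, word in enumerate(words):
            if word in LEXICON:
                score = LEXICON[word]
                if i > 0 and words[i - 1] in NEGATIONS:
                    score = -0.5 * score  # illustrative negation heuristic
                scores.append(score)
        return sum(scores) / len(scores) if scores else 0.0

    print(polarity("the movie was great"))      # 0.8
    print(polarity("the movie was not great"))  # -0.4
    print(polarity("the movie exists"))         # 0.0

Note the last case: a sentence with no lexicon words comes out as exactly 0.0, which is indistinguishable from genuine neutrality in the returned score.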

answered Sep 23 '22 by Pie-ton


The TextBlob NaiveBayesAnalyzer is based on NLTK's Naive Bayes classifier. The Naive Bayes algorithm in general is explained here: A simple explanation of Naive Bayes Classification

and its application to sentiment and objectivity is described here: http://nlp.stanford.edu/courses/cs224n/2009/fp/24.pdf

Basically you're right that certain words will be labeled something like "40% positive / 60% negative" based on how they were used in some body of training data (for the Stanford NLTK, the training data was movie reviews). Then the scores of all words in your sentence get multiplied to produce the sentence score.
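The multiplication described above can be sketched like this (the word likelihoods and priors are invented for illustration; in NLTK they are estimated from counts in the movie-review corpus):

    import math

    # Hypothetical per-word likelihoods P(word | class), e.g. learned
    # from word counts in a labeled training corpus.
    LIKELIHOODS = {
        "great":    {"pos": 0.40, "neg": 0.10},
        "terrible": {"pos": 0.05, "neg": 0.35},
        "plot":     {"pos": 0.20, "neg": 0.20},
    }
    PRIORS = {"pos": 0.5, "neg": 0.5}

    def classify(sentence):
        """Naive Bayes: score each class by prior * product of word
        likelihoods, done in log space to avoid underflow. Words not
        seen in training are simply skipped here."""
        words = sentence.lower().split()
        scores = {}
        for cls, prior in PRIORS.items():
            log_score = math.log(prior)
            for word in words:
                if word in LIKELIHOODS:
                    log_score += math.log(LIKELIHOODS[word][cls])
            scores[cls] = log_score
        return max(scores, key=scores.get)

    print(classify("great plot"))     # pos
    print(classify("terrible plot"))  # neg

Because unknown words are skipped, they contribute nothing to either class's score, which is consistent with the behavior described in the next paragraph.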

I haven't tested this, but I expect that if the library returns exactly 0.0, your sentence didn't contain any words that had a polarity in the NLTK training set. I suspect those words were excluded because 1) they were too rare in the training data, or 2) they carry no sentiment (stop words such as "the", "a", and "and").

That goes for the Naive Bayes analyzer. As for the PatternAnalyzer, the TextBlob docs say it is based on the "pattern" library, but they don't document how it works. I suspect something similar is happening there.

answered Sep 21 '22 by Luke