
Sentiment analysis: fetch tweets matching a search query and analyze them

I want to perform sentiment analysis on Twitter. I don't want to store any of the tweets, only analyze them, for example to find tweets that say positive things about a particular hashtag. The problem is that accessing the tweets is too slow. How can I fetch tweets, analyze them, and return the results to the user quickly? A good example is here: http://www.sentiment140.com/search?query=hello&hl=en

The site linked above only fetches and analyzes 10 tweets, though. I want to know how to do this through the API so that the user gets a quick response.

Another good example is http://snapbird.org/. Knowing how to fetch tweets and analyze them automatically, without storing them anywhere, would be the perfect solution.

Please note, I am only asking how tweets can be accessed without storage, so that I can analyze them directly and show the results to users in my web app.

fscore asked Apr 14 '14 01:04


2 Answers

Sentiment140 runs on Google App Engine, so there is a good chance they use Python for this. Python is well suited to the task: it has good libraries for sentiment analysis (NLTK) and for consuming the Twitter APIs (tweepy), and there are plenty of tutorials out there. You could follow these steps:

  1. Grab the last N tweets for your keyword (with the tweepy library; example below)
  2. Keep them in an in-memory list (nothing needs to be written to disk or a database)
  3. Pass the list to a Bayesian classifier built with Python's NLTK [see links]
  4. Get the result of the analysis in near real-time
  5. Present the results to the user if you want (in a Django/Flask template, etc.)


Getting N tweets from the twitter API

Example with tweepy (returns the 10 most recent tweets matching the keyword 'Lionel Messi')

#!/usr/bin/env python3

import tweepy

# Credentials of your Twitter developer app (placeholders)
ckey = 'xxx'
csecret = 'xxx'
atoken = 'xxx'
asecret = 'xxx'

# Authenticate with OAuth 1.0a user credentials
auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

api = tweepy.API(auth)

tweets = []                             # In-memory list you pass to the Bayesian classifier
# Tweepy 4+ names this method api.search_tweets; older releases call it api.search
for tweet in tweepy.Cursor(api.search_tweets,
                           q="Lionel Messi",
                           result_type="recent",
                           include_entities=True,
                           lang="en").items(10):
    print(tweet.created_at, tweet.text)
    tweets.append(tweet.text)           # Keep only the text; nothing is persisted


Building a Naive Bayes Classifier

Examples about how to build your classifier and nice resources:

http://ravikiranj.net/drupal/201205/code/machine-learning/how-build-twitter-sentiment-analyzer
https://github.com/ravikiranj/twitter-sentiment-analyzer

Please bear in mind that you'll have to train and fine-tune your classifiers. These resources contain more information and boilerplate code.
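To give an idea of what the classifier does with your list of tweets, here is a minimal sketch of a Naive Bayes classifier in pure Python. It is not NLTK's implementation and not the code from the links above: the function names (tokenize, train, classify) and the tiny training set are made up for illustration, and a real classifier needs far more labelled data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a real tokenizer would also
    handle punctuation, URLs and @mentions)."""
    return text.lower().split()


def train(labeled_tweets):
    """Count words per label from an iterable of (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of tweets
    vocabulary = set()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the most probable label, using add-one smoothing."""
    word_counts, label_counts, vocabulary = model
    total_tweets = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log P(label) + sum of log P(word | label)
        score = math.log(label_counts[label] / total_tweets)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for this example
TRAINING = [
    ("what a great goal", "positive"),
    ("i love this team", "positive"),
    ("great match love it", "positive"),
    ("terrible game", "negative"),
    ("i hate this referee", "negative"),
    ("awful terrible defending", "negative"),
]

model = train(TRAINING)
tweets = ["love this great goal", "terrible awful referee"]
for tweet in tweets:
    print(tweet, "->", classify(tweet, model))
```

In practice you would replace the toy training set with a labelled corpus and pass in the tweets list collected by the tweepy script.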

PS: Alternatively, you can pass your list/dict of tweets to a service such as text-processing.com's API, which will do the sentiment analysis for you:

http://text-processing.com/demo/sentiment/
https://www.mashape.com/japerk/text-processing/pricing#!documentation
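If you go the hosted route, the call could look roughly like the sketch below, which uses only the standard library. The endpoint URL, the form field name text and the response shape (a JSON object with a label key) are assumptions taken from my reading of the service's public documentation, so check them against the links above before relying on them. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the service's documentation
API_URL = "http://text-processing.com/api/sentiment/"


def build_request(text):
    """Build (but do not send) a form-encoded POST request for one tweet."""
    data = urllib.parse.urlencode({"text": text}).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, method="POST")


def parse_label(response_body):
    """Extract the label from a JSON body assumed to look like
    {"label": "pos", "probability": {...}}."""
    return json.loads(response_body)["label"]


def analyse(text, timeout=10):
    """Send one tweet to the service and return its sentiment label."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as response:
        return parse_label(response.read().decode("utf-8"))
```

Calling analyse(tweet) for each item in your tweets list gives you one label per tweet. Keep in mind that this adds one HTTP round trip per tweet, which matters if response time is your main concern.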


Showing the results in a simple website

For this task you can use flask-tweepy. Read its demo and you'll see how easy it is to incorporate the scripts above into Flask and render the results in a view.


Hope it helps!

sdude answered Nov 15 '22 07:11


You want to use Twitter's Streaming API.

With this, you can get a near real-time feed from Twitter, filtered to whatever search text you want.

You won't need to make multiple requests, or store the results; just stream and analyse.
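As a rough illustration of the stream-and-analyse pattern, the sketch below consumes tweets one at a time and keeps only running totals. The generator fake_tweet_stream is a hypothetical stand-in for a real Streaming API connection, and the word-list scorer is a toy placeholder for a trained classifier; neither is part of any Twitter library.

```python
from collections import Counter

POSITIVE_WORDS = {"love", "great"}
NEGATIVE_WORDS = {"terrible", "hate"}


def fake_tweet_stream():
    """Hypothetical stand-in for a streaming connection:
    yields tweet texts one at a time."""
    yield "love the new release"
    yield "this update is terrible"
    yield "great work, love it"


def score(text):
    """Toy word-list scorer standing in for a trained classifier."""
    words = set(text.lower().replace(",", " ").split())
    positives = len(words & POSITIVE_WORDS)
    negatives = len(words & NEGATIVE_WORDS)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


def analyse_stream(stream, classify):
    """Classify each tweet as it arrives and keep only the running totals;
    the tweet text itself is discarded after scoring."""
    totals = Counter()
    for text in stream:
        totals[classify(text)] += 1
    return totals


totals = analyse_stream(fake_tweet_stream(), score)
print(dict(totals))
```

Because only the counters survive each iteration, memory use stays constant however long the stream runs, and the web app can read the totals at any time.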

Terence Eden answered Nov 15 '22 09:11