Here is my current code
from twitter import *
t = Twitter(auth=OAuth(TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET,
ACCESS_TOKEN, ACCESS_TOKEN_SECRET))
t.statuses.home_timeline()
query=raw_input("enter the query \n")
data = t.search.tweets(q=query)
for i in range (0,1000):
print data['statuses'][i]['text']
print '\n'
Here, I fetch tweets from all the languages. Is there a way to restrict myself to fetching tweets only in English?
We can filter according to the type of content and language of the tweets and/or the accounts that have been mentioned. We can also search tweets by date and location.
Getting Started Set up a Twitter account if you don't have one already. Using your Twitter account, you will need to apply for Developer Access and then create an application that will generate the API credentials that you will use to access Twitter from Python . Import the tweepy package.
Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected with the Twitter API, you can perform complex queries in addition to scraping tweets. It enables you to take advantage of all of the Twitter API's capabilities.
There are at least 4 ways... I put them in the order of simplicity.
After you collect the tweets, the json output has a key/value pair that identifies the language. So you can use something like this to take all language tweets and select only the ones that are from English accounts.
for i in range (0,1000):
if data['statuses'][i][u'lang']==u'en':
print data['statuses'][i]['text']
print '\n'
Another way to collect only tweets that are identified in English, you can use the optional 'lang' parameter to request from the API only English (self-idenfitied) tweets. See details here. If you are using the python-twitter library, you can set the 'lang' parameter in twitter.py.
Use a language recognition package like guess-language.
Or if you want to recognize English text without using the self-identified twitter data (i.e. a chinese account that is writing in English), then you have to do Natural Language Processing. One option. This method will recognize common English words and then mark the text as English.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With