
Managing Tweepy API Search

Please forgive me if this is a gross repeat of a question previously answered elsewhere, but I am lost on how to use the tweepy API search function. Is there any documentation available on how to search for tweets using the api.search() function?

Is there any way I can control features such as number of tweets returned, results type etc.?

The results seem to max out at 100 for some reason.

The code snippet I use is as follows:

searched_tweets = self.api.search(q=query,rpp=100,count=1000)

user3075934 asked Mar 18 '14 03:03



1 Answer

I originally worked out a solution based on Yuva Raj's suggestion to use additional parameters in GET search/tweets: a loop that, on each iteration, passes the max_id parameter derived from the id of the last tweet returned, and that also checks for a TweepError.

However, I discovered there is a far simpler way to solve the problem using a tweepy.Cursor (see tweepy Cursor tutorial for more on using Cursor).

The following code fetches the most recent 1000 mentions of 'python'.

import tweepy

# assuming twitter_authentication.py contains each of the 4 oauth elements (1 per line)
from twitter_authentication import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

api = tweepy.API(auth)

query = 'python'
max_tweets = 1000
searched_tweets = [status for status in tweepy.Cursor(api.search, q=query).items(max_tweets)]
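On the question of controlling the number of tweets per request and the result type: extra keyword arguments given to the Cursor are passed through to the search call. The sketch below is only an illustration. `build_search_kwargs` is a hypothetical helper, not part of tweepy, and it assumes the standard search endpoint's limit of 100 tweets per request (`count`) and its `result_type` values of mixed, recent, and popular. Note also that newer tweepy releases (4.x) name the method `search_tweets` rather than `search`.

```python
# Hypothetical helper (not part of tweepy): builds keyword arguments for a search call.
MAX_PER_REQUEST = 100  # assumed per-request cap of the standard search endpoint
RESULT_TYPES = {"mixed", "recent", "popular"}  # assumed accepted result_type values


def build_search_kwargs(query, per_page=100, result_type="recent"):
    """Return a dict of search parameters with the page size clamped to the cap."""
    if result_type not in RESULT_TYPES:
        raise ValueError("result_type must be one of %s" % sorted(RESULT_TYPES))
    # Clamp the page size into the range 1..MAX_PER_REQUEST
    per_page = max(1, min(per_page, MAX_PER_REQUEST))
    return {"q": query, "count": per_page, "result_type": result_type}


kwargs = build_search_kwargs("python", per_page=1000, result_type="recent")
print(kwargs)

# With an authenticated `api` object, the arguments could be passed on like this:
# searched_tweets = list(tweepy.Cursor(api.search, **kwargs).items(max_tweets))
```

Here `count` only sets how many tweets come back per request; the total is still governed by `.items(max_tweets)`, which is why asking for `count=1000` in a single call returns no more than 100.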

Update: in response to Andre Petre's comment about potential memory consumption issues with tweepy.Cursor, I'll include my original solution, replacing the single statement list comprehension used above to compute searched_tweets with the following:

searched_tweets = []
last_id = -1
while len(searched_tweets) < max_tweets:
    count = max_tweets - len(searched_tweets)
    try:
        new_tweets = api.search(q=query, count=count, max_id=str(last_id - 1))
        if not new_tweets:
            break
        searched_tweets.extend(new_tweets)
        last_id = new_tweets[-1].id
    except tweepy.TweepError as e:
        # depending on TweepError.code, one may want to retry or wait
        # to keep things simple, we will give up on an error
        break
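To see why the loop keeps lowering max_id, here is a minimal offline sketch of the same paging idea. It makes no network calls: `fake_search` and `FAKE_TIMELINE` are stand-ins invented for illustration, and the cap of 100 results per request is an assumption about the search endpoint.

```python
# Offline simulation of max_id paging; fake_search is a stand-in, not a tweepy call.
FAKE_TIMELINE = list(range(1000, 0, -1))  # pretend tweet ids, newest (largest) first
PAGE_CAP = 100  # assumed per-request cap


def fake_search(count, max_id=None):
    """Return up to `count` ids that are <= max_id, newest first, capped at PAGE_CAP."""
    eligible = [i for i in FAKE_TIMELINE if max_id is None or i <= max_id]
    return eligible[:min(count, PAGE_CAP)]


def fetch(max_tweets):
    """Collect up to max_tweets ids by repeatedly asking for older pages."""
    collected = []
    max_id = None  # no upper bound on the first request
    while len(collected) < max_tweets:
        page = fake_search(max_tweets - len(collected), max_id)
        if not page:
            break  # nothing older is left
        collected.extend(page)
        max_id = page[-1] - 1  # next page starts just below the oldest id seen
    return collected


ids = fetch(250)
print(len(ids), ids[0], ids[-1])  # prints: 250 1000 751
```

Each request returns at most one page, so 250 results take three requests; subtracting 1 from the oldest id seen keeps the next page from repeating that tweet.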
gumption answered Sep 20 '22 20:09
