Get twitter followers using tweepy and multiple API keys

I have multiple Twitter dev keys that I am using to get followers for a list of handles. There are two ways I can do this, but I have a problem with both. The first:

try:
    ....
    for user in tweepy.Cursor(api.followers, screen_name=screenName).items():
        ....
except tweepy.TweepError as e:
    errorCode = e.message[0]['code']
    if errorCode == 88:
        print "Rate limit exceeded."
        rotateKeys()

The issue here is that every time I rotate keys, the for loop starts from scratch and fetches the followers all over again. I tried to get around this by splitting the for loop:

try:
    items = tweepy.Cursor(api.followers, screen_name=s).items()

I then loop through them manually using next(items).

However, rotating API keys does not work, because the initial call was made with the first API key and the iterator will always try to use that one.

I need a way to rotate keys and continue from where the previous one left off.

Martyn asked Dec 06 '13 15:12


2 Answers

I actually ended up abandoning the cursored method in favor of manually setting the next cursor. The nice thing about this is that the "non-cursored" method returns the previous and next cursors as part of its return value.

Here's how I achieved what you are going for (note: adding a try/except is probably in order):

users = ['user_one', 'user_two', 'user_three']

current_profile = 9 # I HAVE TEN IN AN ARRAY

tweepy_api = get_api(auth_profiles[current_profile]) #A FUNCTION I CREATED TO REINITIALIZE API'S

for user in users:

    next_cursor = -1 # START EVERY NEW USER RETRIEVAL WITH -1

    print 'CURRENT USER:', user, 'STARTING CURSOR:', next_cursor

    while next_cursor: # THAT IS, WHILE CURSOR IS NOT ZERO

        print 'AUTH PROFILE', current_profile, 'CURRENT CURSOR:', next_cursor

        # RETURNS A TUPLE WITH ELEMENT[0] A LIST OF IDS, ELEMENT [1][0] PREVIOUS CURSOR, AND ELEMENT[1][1] NEXT CURSOR
        ids, cursors = tweepy_api.followers_ids(screen_name=user, count=5000, cursor=next_cursor)

        next_cursor = cursors[1] # STORE NEXT CURSOR

        # FUNCTION I CREATED TO GET STATUS FROM API.rate_limit_status()
        status = get_rate_limit_status(tweepy_api, '/followers/ids')

        print 'ID\'S RETRIEVED:', len(ids), 'NEXT CURSOR:', cursors[1], 'REMAINING:', status['remaining']

        if not status['remaining']: # IF REMAINING IS ZERO

            print ''
            print 'RATE LIMIT REACHED'

            if current_profile < len(auth_profiles) - 1: # IF THE CURRENT PROFILE IS LESS THAN NINE (IN MY CASE)

                print 'INCREMENTING CURRENT PROFILE:', current_profile, '<', len(auth_profiles) - 1

                current_profile += 1 # INCREMENT THE PROFILE

                print 'CURRENT PROFILE:', current_profile

            else: # ELSE, IT MUST EQUAL NINE (COULD BE NEG I SUPPOSE BUT...)

                print 'RESETTING CURRENT PROFILE TO ZERO:', current_profile, '=', len(auth_profiles) - 1

                current_profile = 0 # RESET CURRENT PROFILE TO THE BEGINNING

                print 'CURRENT PROFILE:', current_profile

            tweepy_api = get_api(auth_profiles[current_profile]) # GET NEW TWEEPY API WITH NEW AUTH
            print ''
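
The code above calls get_rate_limit_status without showing it. A minimal sketch of what such a helper might look like, assuming api.rate_limit_status() returns the parsed JSON dictionary with a top-level 'resources' key grouped by endpoint family (the function body here is a guess, not the answerer's actual helper):

```python
def get_rate_limit_status(api, endpoint):
    """Return the rate-limit entry for one endpoint, e.g. '/followers/ids'.

    Assumes api.rate_limit_status() returns a dict shaped like
    {'resources': {'followers': {'/followers/ids': {'limit': ..,
    'remaining': .., 'reset': ..}}}}.
    """
    # The resource family is the first path segment: 'followers' here.
    family = endpoint.strip('/').split('/')[0]
    return api.rate_limit_status()['resources'][family][endpoint]
```

The returned dictionary is what the loop reads status['remaining'] from.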

The output of the rotation loop should look something like this (I've removed some of the print statements for simplicity):

CURRENT USER: user_one STARTING CURSOR: -1
AUTH PROFILE 9 CURRENT CURSOR: -1

ID'S RETRIEVED: 5000 NEXT CURSOR: 1594511885763407081 REMAINING: 14
…
ID'S RETRIEVED: 5000 NEXT CURSOR: 1582249691352919104 REMAINING: 0

RATE LIMIT REACHED
RESETTING CURRENT PROFILE TO ZERO: 9 = 9
CURRENT PROFILE: 0

ID'S RETRIEVED: 5000 NEXT CURSOR: 1580277475971792716 REMAINING: 14
…
ID'S RETRIEVED: 4903 NEXT CURSOR: 0 REMAINING: 7

CURRENT USER: user_two STARTING CURSOR: -1
AUTH PROFILE 0 CURRENT CURSOR: -1

ID'S RETRIEVED: 5000 NEXT CURSOR: 1592820762836029887 REMAINING: 6
…
ID'S RETRIEVED: 5000 NEXT CURSOR: 1592737463603654258 REMAINING: 0

RATE LIMIT REACHED
INCREMENTING CURRENT PROFILE: 0 < 9
CURRENT PROFILE: 1
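
The same idea can be written with the try/except mentioned at the top of this answer instead of polling the rate-limit status. This is only a sketch: RateLimited and the fetcher callables are hypothetical stand-ins, not part of Tweepy. Each fetcher takes (user, cursor) and returns (ids, next_cursor) for one set of credentials.

```python
from itertools import cycle


class RateLimited(Exception):
    """Hypothetical stand-in for the library's rate-limit error."""


def collect_follower_ids(fetch_pages, user, max_rotations=100):
    """Collect all follower IDs for `user`, rotating fetchers on rate limits.

    fetch_pages -- list of callables, one per credential set; each takes
                   (user, cursor) and returns (ids, next_cursor), or raises
                   RateLimited.
    """
    fetchers = cycle(fetch_pages)
    fetch = next(fetchers)
    cursor = -1      # -1 asks for the first page
    ids = []
    rotations = 0
    while cursor:    # a next cursor of 0 means there are no more pages
        try:
            page, next_cursor = fetch(user, cursor)
        except RateLimited:
            rotations += 1
            if rotations > max_rotations:
                raise
            fetch = next(fetchers)  # switch credentials, keep the same cursor
            continue
        ids.extend(page)
        cursor = next_cursor
    return ids
```

With the Tweepy version used above, each fetcher could wrap tweepy_api.followers_ids(screen_name=user, count=5000, cursor=cursor), return (ids, cursors[1]), and translate a TweepError with code 88 into RateLimited. Because the cursor is only advanced after a successful call, a key switch retries the page that failed rather than starting over.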

As a side note, if you are going to use a cursored version, at least in Tweepy 3.5.0 the next and previous cursors are stored in cursor.iterator.next_cursor and cursor.iterator.prev_cursor. I think this is also the case for 3.6.0 (see Cursor and CursorIterator in cursor.py).

For me, cursor.page_iterator.next_cursor raises:

AttributeError: 'Cursor' object has no attribute 'page_iterator'
crld answered Nov 05 '22 11:11


You can get the cursor that was in use when the rate limit occurred from the next_cursor attribute of the iterator. When you create a new Cursor using the new API instance, you can pass that cursor as a parameter:

current_cursor = cursor.iterator.next_cursor
# re-create the cursor using the new api instance
cursor = tweepy.Cursor(api.followers, screen_name=s, cursor=current_cursor)
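
Put together, the resume step might be wrapped in a generator like the sketch below. It is written against a hypothetical make_cursor factory so that it does not depend on a particular Tweepy version; the only assumptions are that the returned object has .items() and .iterator.next_cursor, as described above, and that rate_limit_error is whatever exception the library raises on a rate limit.

```python
def iter_items_with_rotation(apis, make_cursor, rate_limit_error):
    """Yield items across rate limits by rebuilding the cursor per API client.

    apis             -- list of API clients, one per credential set
    make_cursor      -- hypothetical factory: (api, start_cursor) -> object
                        with .items() and .iterator.next_cursor
    rate_limit_error -- exception class raised when the limit is hit
    """
    index = 0
    start = -1  # -1 requests the first page
    while True:
        cursor = make_cursor(apis[index], start)
        try:
            for item in cursor.items():
                yield item
            return  # pagination finished normally
        except rate_limit_error:
            # next_cursor still points at the page that failed to load.
            start = cursor.iterator.next_cursor
            index = (index + 1) % len(apis)  # move to the next key
```

With Tweepy this could be called with make_cursor=lambda api, start: tweepy.Cursor(api.followers, screen_name=s, cursor=start) and rate_limit_error=tweepy.TweepError. Note that the sketch retries immediately even when every key is exhausted; a real script would probably sleep until the rate-limit window resets.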
Aaron Hill answered Nov 05 '22 11:11