
How to get large list of followers Tweepy

I'm trying to use Tweepy to get the full list of followers from an account with around 500k followers. I have code that gives me the usernames for smaller accounts (under 100 followers), but if I try an account with even 110 followers, it doesn't work. Any help figuring out how to make it work with larger numbers is greatly appreciated!

Here's the code I have right now:

import tweepy
import time

key1 = "..."
key2 = "..."
key3 = "..."
key4 = "..."

accountvar = raw_input("Account name: ")

auth = tweepy.OAuthHandler(key1, key2)
auth.set_access_token(key3, key4)

api = tweepy.API(auth)

ids = []
for page in tweepy.Cursor(api.followers_ids, screen_name=accountvar).pages():
     ids.extend(page)
     time.sleep(60)

users = api.lookup_users(user_ids=ids)
for u in users:
     print u.screen_name

The error I keep getting is:

Traceback (most recent call last):
  File "test.py", line 24, in <module>
    users = api.lookup_users(user_ids=ids)
  File "/Library/Python/2.7/site-packages/tweepy/api.py", line 321, in lookup_users
    return self._lookup_users(post_data=post_data)
  File "/Library/Python/2.7/site-packages/tweepy/binder.py", line 239, in _call
    return method.execute()
  File "/Library/Python/2.7/site-packages/tweepy/binder.py", line 223, in execute
    raise TweepError(error_msg, resp)
tweepy.error.TweepError: [{u'message': u'Too many terms specified in query.', u'code': 18}]

I've looked at a bunch of other questions on this topic, but none had a solution that worked for me. If someone has a link to a solution, please send it to me!

mataxu asked Jun 23 '15 10:06




4 Answers

I actually figured it out, so I'll post the solution here just for reference.

import tweepy
import time

key1 = "..."
key2 = "..."
key3 = "..."
key4 = "..."

accountvar = raw_input("Account name: ")

auth = tweepy.OAuthHandler(key1, key2)
auth.set_access_token(key3, key4)

api = tweepy.API(auth)

users = tweepy.Cursor(api.followers, screen_name=accountvar).items()

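while True:
    try:
        user = next(users)
    except tweepy.TweepError:
        # Most likely the rate limit: wait out the 15-minute window,
        # then go back to the top of the loop and retry.
        time.sleep(60*15)
        continue
    except StopIteration:
        break
    print "@" + user.screen_name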

This pauses for 15 minutes after every 300 names and then continues, which keeps it within the rate limit. This will obviously take ages for large accounts, but as Leb mentioned:

The twitter API only allows 100 users to be searched for at a time...[so] what you'll need to do is iterate through each 100 users but staying within the rate limit.

You basically just have to leave the program running if you want the next set. I don't know why mine is giving 300 at a time instead of 100, but as I mentioned about my program earlier, it was giving me 100 earlier as well.

Hope this helps anyone else that had the same problem as me, and shoutout to Leb for reminding me to focus on the rate limit.
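
The same wait-and-retry idea can be pulled into a small helper. This is only a sketch: iterate_with_backoff and its parameters are names made up here, not part of Tweepy, and it assumes the iterator can be resumed with next() after it raises (which the loop above also assumes of the Cursor's item iterator).

```python
import time


def iterate_with_backoff(iterator, retry_on, wait_seconds=15 * 60, sleep=time.sleep):
    """Yield items from `iterator`, sleeping and retrying whenever `retry_on` is raised.

    `retry_on` is the exception type treated as "rate limited" (an assumption
    made by the caller); `sleep` is injectable so the helper can be tested
    without actually waiting.
    """
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # The underlying iterator is exhausted: stop cleanly.
            return
        except retry_on:
            # Wait out the window, then retry the same next() call.
            sleep(wait_seconds)
            continue
        yield item
```

With the code above it might be called as iterate_with_backoff(users, tweepy.TweepError), assuming the installed Tweepy version exposes TweepError as it does in this thread.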

mataxu answered Oct 20 '22 01:10


To extend upon this:

You can harvest 3,000 users per 15 minutes by adding a count parameter:

users = tweepy.Cursor(api.followers, screen_name=accountvar, count=200).items()

This will call the Twitter API 15 times as per your version, but rather than the default count=20, each API call will return 200 (i.e. you get 3000 rather than 300).
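
To put that in perspective for the 500k-follower account from the question, here is a rough back-of-the-envelope calculation. It assumes 15 requests per 15-minute window, as described above; windows_needed is just an illustrative name.

```python
REQUESTS_PER_WINDOW = 15   # calls allowed per rate-limit window (figure from this thread)
WINDOW_MINUTES = 15


def windows_needed(total_followers, count_per_request):
    """Number of 15-minute windows needed to page through all followers."""
    per_window = REQUESTS_PER_WINDOW * count_per_request
    # Integer ceiling division, so a partly used window still counts.
    return (total_followers + per_window - 1) // per_window


for count in (20, 200):
    windows = windows_needed(500000, count)
    hours = windows * WINDOW_MINUTES / 60.0
    print("count=%d: %d windows, about %.1f hours" % (count, windows, hours))
```

So count=200 brings a 500k-follower account down from roughly 417 hours to roughly 42 hours, which is still a long run.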

Alec answered Oct 19 '22 23:10


Twitter provides two ways to fetch followers:

  1. Fetching the full follower list (followers/list in the Twitter API, api.followers in Tweepy). Alec and mataxu use this approach in their answers. Its rate limit allows at most 200 * 15 = 3000 followers per 15-minute window.
  2. A two-stage approach:
    a) Fetch only the follower ids first (followers/ids in the Twitter API, api.followers_ids in Tweepy). You can get 5000 * 15 = 75K follower ids per 15-minute window.

    b) Look up their usernames or other data (users/lookup in the Twitter API, api.lookup_users in Tweepy). This is rate limited to about 100 * 180 = 18K lookups per 15-minute window.
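
The per-window figures in the list above can be checked with a few lines of arithmetic. The numbers are the ones quoted in this answer, not values read from the API, and the sketch assumes the two-stage approach is limited by its slower stage.

```python
# Requests allowed per 15-minute window and items returned per request,
# as quoted in this answer (not queried from Twitter).
WINDOW_REQUESTS = {"followers/list": 15, "followers/ids": 15, "users/lookup": 180}
PER_REQUEST = {"followers/list": 200, "followers/ids": 5000, "users/lookup": 100}


def per_window(endpoint):
    """Maximum number of items one endpoint can return per window."""
    return WINDOW_REQUESTS[endpoint] * PER_REQUEST[endpoint]


full_list = per_window("followers/list")
# The two stages run against separate limits, so the slower one is the bottleneck.
two_stage = min(per_window("followers/ids"), per_window("users/lookup"))

print(full_list, two_stage, two_stage // full_list)
```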

Considering the rate limits, the second approach returns follower data about 6 times faster than the first (18K versus 3K users per window). Below is code for the second approach:

# First, make sure wait_on_rate_limit is set to True when connecting through Tweepy
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

# Request 5000 follower ids per call, i.e. 75K ids per 15-minute window
# (15 requests can be made in each window).
followerids = []
for user_id in tweepy.Cursor(api.followers_ids, screen_name=accountvar, count=5000).items():
    followerids.append(user_id)
print(len(followerids))

# Look up users 100 ids at a time, i.e. up to 18K lookups per 15-minute window
def get_usernames(userids, api):
    fullusers = []
    u_count = len(userids)
    print(u_count)
    try:
        # Stepping by 100 never produces an empty final slice, even when
        # u_count is an exact multiple of 100.
        for start in range(0, u_count, 100):
            fullusers.extend(
                api.lookup_users(user_ids=userids[start:start + 100])
            )
    except tweepy.TweepError:
        import traceback
        traceback.print_exc()
        print('Something went wrong, returning the users fetched so far...')
    return fullusers

# Call the function with the list of follower ids and the Tweepy API object
fullusers = get_usernames(followerids, api)

Hope this helps. A similar approach can be used to fetch friends' details by using api.friends_ids in place of api.followers_ids.

If you need more resources on the rate limit comparison and the second approach, see the links below:

  • https://github.com/tweepy/tweepy/issues/627

  • https://labsblog.f-secure.com/2018/02/27/how-to-get-twitter-follower-data-using-python-and-tweepy/

Himanshu Punetha answered Oct 20 '22 01:10


The twitter API only allows 100 users to be searched for at a time. That's why, no matter how many ids you pass in, you can't get more than 100 users per request. The followers_ids call is giving you the correct number of users, but you're being limited by GET users/lookup.

What you'll need to do is iterate through each 100 users but staying within the rate limit.
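
The batching described above can be sketched without touching the network. batches and lookup_in_batches are hypothetical helper names, and lookup stands for whatever function performs one request for up to 100 ids; rate-limit handling is deliberately left out of this sketch.

```python
def batches(ids, size=100):
    """Yield consecutive slices of `ids`, each holding at most `size` entries."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def lookup_in_batches(ids, lookup, size=100):
    """Call `lookup` once per batch and collect all results in order."""
    users = []
    for batch in batches(ids, size):
        users.extend(lookup(batch))
    return users
```

With the Tweepy version used in this thread, lookup could be something like lambda batch: api.lookup_users(user_ids=batch), with the rate limit handled separately.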

Leb answered Oct 20 '22 00:10