I am building a project in Python that needs to scrape huge amounts of Twitter data: something like 1 million users and all of their tweets.
Previously I have used Tweepy and Twython, but I hit Twitter's rate limits very quickly.
How do sentiment analysis companies and the like get their data? How do they collect all those tweets? Do they buy the data somewhere, or build something that rotates through different proxies?
How do companies like Infochimps get all the data behind, for example, Trst Rank? http://www.infochimps.com/datasets/twitter-census-trst-rank
If you want the latest tweets from specific users, Twitter offers the Streaming API.
The Streaming API provides a real-time sample of the Twitter Firehose. It is meant for developers with data-intensive needs. If you're looking to build a data mining product or are interested in analytics research, the Streaming API is the best fit.
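As a rough illustration of consuming such a stream, here is a minimal sketch. It assumes the stream arrives as newline-delimited JSON with blank keep-alive lines, and that each tweet carries `text` and a `user` object with `id_str` or `id`. The function name `iter_tweets` is made up for this example, and the connection and authentication step is left out entirely; any iterable of lines (for instance from an HTTP client's line iterator) can be passed in.

```python
import json


def iter_tweets(lines, user_ids):
    """Yield tweets by the given users from an iterable of raw stream lines.

    Assumption: one JSON message per line, with empty lines as keep-alives.
    """
    wanted = {str(u) for u in user_ids}
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line:
            continue  # keep-alive line, nothing to parse
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed messages
        if not isinstance(msg, dict):
            continue
        user = msg.get("user")
        # Non-tweet messages (e.g. delete notices) have no user/text fields.
        if not isinstance(user, dict) or "text" not in msg:
            continue
        if str(user.get("id_str", user.get("id"))) in wanted:
            yield msg


if __name__ == "__main__":
    # Hand-written sample lines standing in for a live connection.
    sample = [
        '{"text": "hello", "user": {"id_str": "42"}}',
        "",
        '{"delete": {"status": {"id": 1}}}',
        '{"text": "other", "user": {"id_str": "7"}}',
    ]
    for tweet in iter_tweets(sample, user_ids=[42]):
        print(tweet["user"]["id_str"], tweet["text"])
```

Filtering on the client side like this is only a fallback; if the endpoint you connect to lets you pass the user IDs as a filter parameter, that saves bandwidth.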
If you're trying to access old information, the REST API with its severe request limits is the only way to go.
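To get a feel for what those request limits mean at the scale of 1 million users, a back-of-the-envelope estimate helps. The sketch below is only arithmetic; the function name `estimate_crawl_days` is made up for this example, and the numbers in the usage section are placeholders, not Twitter's actual limits. Substitute the values from the current API documentation.

```python
import math


def estimate_crawl_days(users, tweets_per_user, page_size,
                        requests_per_window, window_seconds):
    """Estimate days needed to page through every user's timeline
    with a single credential under a fixed-window rate limit."""
    requests_per_user = math.ceil(tweets_per_user / page_size)
    total_requests = users * requests_per_user
    windows_needed = total_requests / requests_per_window
    return windows_needed * window_seconds / 86400  # seconds per day


if __name__ == "__main__":
    # Placeholder figures (assumptions, not official limits):
    # 3200 tweets per user, 200 tweets per request,
    # 900 requests allowed per 15-minute window.
    days = estimate_crawl_days(
        users=1_000_000,
        tweets_per_user=3200,
        page_size=200,
        requests_per_window=900,
        window_seconds=15 * 60,
    )
    print(f"about {days:.0f} days with one credential")
```

Under those placeholder figures a single credential would need roughly half a year, which is why the REST API alone is a poor fit for a crawl of this size.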