Avoid 420s with Streaming API?

I have a Python script that hooks into the Twitter Streaming API using basic authentication and the tweetstream module.

I'm gathering around 10 tweets a minute.
I was getting intermittent disconnections, so I am currently logging how often they occur.

I have been hitting my rate limit and getting 420 HTTP errors.

I know that for the Search API you get a higher quota when using OAuth authentication. For streaming, I could not find any reference to differences in rate limiting between basic and OAuth authentication. In any case, it appears that the Python tweetstream module I am using does not support OAuth with the Streaming API.

I noticed the Ruby version of Tweetstream supports OAuth, but I am doing this project as a learning experience for Python.

The Twitter documentation talks of 'backoff strategies' and mentions:

it is essential to stop further connection attempts for a few minutes if a HTTP 420 response is received.

I am no longer getting the errors, but I have been trying to formulate better logic in my code to avoid these errors for good.

My current proposal is below; it now waits 200 seconds before attempting to reconnect.

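import datetime
import time

import tweetstream

# uname, passwd and extent are defined elsewhere in the script.
while True:
    try:
        with tweetstream.FilterStream(uname, passwd, locations=extent) as stream:
            for tweet in stream:
                pass  # do stuff
    except tweetstream.ConnectionError as e:
        # Wait 200 s before the next connection attempt.
        print(e.message + " time: " + str(datetime.datetime.now()))
        time.sleep(200)
    except tweetstream.AuthenticationError as e:
        print(e.message + " time: " + str(datetime.datetime.now()))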

My question is: is this a good way to get around receiving the 420 errors from Twitter? Could those who are more familiar with the Twitter API recommend an approach?

asked by jakc, Nov 18 '12 09:11


1 Answer

420: Rate Limited. Possible reasons are:

- Too many login attempts in a short period of time.
- Running too many copies of the same application authenticating with the same account name.

You should not get a rate-limiting error at such a low rate of streaming (10 tweets per minute). In fact, rate limiting is not applied to streamers, because Twitter simply doesn't send you more tweets than you are allowed to have. Most probably you are getting this error because of too many login attempts in a short period. So it is a good idea to wait for some time before reconnecting (I wait 10 seconds after each disconnect, which happens quite rarely). Make sure that your streamer is not being interrupted by internal programming exceptions rather than Twitter exceptions. Also take a look at the suggestions below.
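As a rough illustration of waiting between reconnects while still letting your own bugs surface, here is a minimal sketch. `StreamDisconnected` and `run_with_backoff` are hypothetical names, not part of tweetstream, and the 10 s starting delay and 320 s cap are assumptions rather than values Twitter prescribes.

```python
import time


class StreamDisconnected(Exception):
    """Hypothetical stand-in for the streaming library's connection error."""


def run_with_backoff(consume, max_attempts=5, base_delay=10, max_delay=320,
                     sleep=time.sleep):
    """Call consume() until it returns, backing off after each disconnect.

    Only StreamDisconnected is retried. Any other exception is treated as a
    bug in our own code and propagates immediately, so it is not mistaken
    for a Twitter-side problem.
    Returns the list of delays (in seconds) that were slept.
    """
    delays = []
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            consume()
            return delays  # stream ended without an error
        except StreamDisconnected as exc:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            print("disconnect %d (%s), waiting %ds" % (attempt, exc, delay))
            sleep(delay)
            delays.append(delay)
            # Double the wait each time, up to the cap.
            delay = min(delay * 2, max_delay)
    return delays
```

In a real script, `consume` would wrap the code that opens the stream and processes tweets, translating the library's connection error into `StreamDisconnected`. A long-running streamer would probably also reset the delay once a connection has been stable for a while.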

You should check that only one stream is running from the same IP. Twitter allows one streamer per IP and per set of basic authentication credentials. So ensure that you are running a single stream from a given IP and that the credentials you authenticate with are used only for this stream. Then you should not get 420 errors.
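One way to guard against accidentally starting a second copy of the script on the same machine is a lock file. The sketch below is only an assumption about how you might do it: the function names and the lock path are hypothetical, and it cannot detect streams started from other machines with the same credentials.

```python
import errno
import os
import tempfile

# Hypothetical default location for the lock file.
DEFAULT_LOCK = os.path.join(tempfile.gettempdir(), "tweetstream.lock")


def acquire_single_instance_lock(path=DEFAULT_LOCK):
    """Create the lock file atomically.

    Returns an open file descriptor on success, or None if the lock file
    already exists (another copy is probably running).
    """
    try:
        # O_EXCL makes creation fail if the file is already there.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as exc:
        if exc.errno != errno.EEXIST:
            raise
        return None


def release_single_instance_lock(fd, path=DEFAULT_LOCK):
    """Close and delete the lock file so the next run can start."""
    os.close(fd)
    os.remove(path)
```

The script would acquire the lock before connecting, exit if it gets `None`, and release the lock in a `finally` block. A crash can leave a stale lock file behind, which then has to be removed by hand.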

But if for some reason your streamer is interrupted, either by Twitter exceptions or by internal programming exceptions, you should wait a while before reconnecting to avoid further exceptions. Twitter also returns how long you need to wait before reconnecting in the response headers (the quote below is for the Search API, but the header should also be included for streaming).

An application that exceeds the rate limitations of the Search API will receive an HTTP 420 response code. It is best practice to watch for this error condition and honor the Retry-After header which is returned. The Retry-After header's value is the number of seconds your application should wait before requesting data from the Search API again.
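If your HTTP client exposes the response headers, honoring Retry-After could look roughly like this. The helper name `seconds_to_wait` and the 60-second fallback are assumptions for illustration. Whether the tweetstream module gives you access to the headers is also something you would need to check.

```python
def seconds_to_wait(headers, default=60):
    """Return the wait suggested by a Retry-After header, else `default`.

    `headers` is any dict-like mapping of header names to string values.
    Header names are matched case-insensitively.
    """
    value = None
    for name, header_value in headers.items():
        if name.lower() == "retry-after":
            value = header_value
            break
    try:
        seconds = int(value)
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default
    return max(seconds, 0)
```

After a 420 response you would then call something like `time.sleep(seconds_to_wait(response_headers))` before reconnecting.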

answered by cubbuk, Oct 04 '22 03:10