How to restart tweepy script in case of error?

I have a Python script that continuously stores tweets matching tracked keywords to a file. However, the script crashes repeatedly with the error shown below. How do I edit the script so that it restarts automatically? I've seen numerous solutions, including this one (Restarting a program after exception), but I'm not sure how to implement it in my script.

import sys
import tweepy
import json
import os

consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
# directory that you want to save the json file
os.chdir(r"C:\Users\json_files")  # raw string so the backslashes are never read as escapes
# name of json file you want to create/open and append json to
save_file = open("12may.json", 'a')

class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, api):
        # Name this class in super() so StreamListener.__init__ actually runs;
        # the base class stores api as self.api.
        super(CustomStreamListener, self).__init__(api)

        # self.list_of_tweets = []

    def on_data(self, tweet):
        print tweet
        save_file.write(str(tweet))

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        print "Stream restarted"  # moved above the return; it was unreachable after it
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        print "Stream restarted"  # moved above the return; it was unreachable after it
        return True # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=["test"])

===========================================================================

Traceback (most recent call last):
  File "C:\Users\tweets_to_json.py", line 41, in <module>
    sapi.filter(track=["test"])
  File "C:\Python27\lib\site-packages\tweepy-2.3-py2.7.egg\tweepy\streaming.py", line 316, in filter
    self._start(async)
  File "C:\Python27\lib\site-packages\tweepy-2.3-py2.7.egg\tweepy\streaming.py", line 235, in _start
    self._run()
  File "C:\Python27\lib\site-packages\tweepy-2.3-py2.7.egg\tweepy\streaming.py", line 165, in _run
    self._read_loop(resp)
  File "C:\Python27\lib\site-packages\tweepy-2.3-py2.7.egg\tweepy\streaming.py", line 206, in _read_loop
    for c in resp.iter_content():
  File "C:\Python27\lib\site-packages\requests-1.2.3-py2.7.egg\requests\models.py", line 541, in generate
    chunk = self.raw.read(chunk_size, decode_content=True)
  File "C:\Python27\lib\site-packages\requests-1.2.3-py2.7.egg\requests\packages\urllib3\response.py", line 171, in read
    data = self._fp.read(amt)
  File "C:\Python27\lib\httplib.py", line 543, in read
    return self._read_chunked(amt)
  File "C:\Python27\lib\httplib.py", line 603, in _read_chunked
    value.append(self._safe_read(amt))
  File "C:\Python27\lib\httplib.py", line 660, in _safe_read
    raise IncompleteRead(''.join(s), amt)
IncompleteRead: IncompleteRead(0 bytes read, 1 more expected)
Eugene Yan asked May 12 '14 05:05


2 Answers

I figured out how to incorporate the while/try loop by wrapping the stream in a new function:

def start_stream():
    while True:
        try:
            sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
            sapi.filter(track=["Samsung", "s4", "s5", "note" "3", "HTC", "Sony", "Xperia", "Blackberry", "q5", "q10", "z10", "Nokia", "Lumia", "Nexus", "LG", "Huawei", "Motorola"])
        except: 
            continue

start_stream()

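I tested the auto-restart by manually interrupting the program with CMD + C. This works because a bare except also catches KeyboardInterrupt, so the loop simply starts the stream again. Nonetheless, I'd be happy to hear of better ways to test this.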
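
One possible refinement, as a minimal sketch rather than a tested tweepy recipe: catch Exception instead of using a bare except, so that a keyboard interrupt still stops the script, and wait a little longer after each failure instead of reconnecting immediately. The names run_with_restart, start, max_restarts, base_delay, max_delay and sleep are hypothetical and exist only for this illustration; start stands for any callable that opens the stream and blocks.

```python
import time


def run_with_restart(start, max_restarts=None, base_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call start() and restart it after any ordinary exception.

    Returns the number of restarts once start() returns normally.
    If max_restarts is set and exceeded, the last exception is re-raised.
    """
    restarts = 0
    delay = base_delay
    while True:
        try:
            start()
            return restarts
        except Exception as exc:
            # KeyboardInterrupt does not derive from Exception,
            # so a manual interrupt still ends the program.
            if max_restarts is not None and restarts >= max_restarts:
                raise
            restarts += 1
            print("stream failed (%r); restarting in %.1fs" % (exc, delay))
            sleep(delay)
            # Double the wait after each failure, up to max_delay.
            delay = min(delay * 2, max_delay)
```

With the code from the question, this could be called as run_with_restart(lambda: sapi.filter(track=["test"])). The sleep parameter is injectable so that the restart logic can be tested by raising an exception from a stand-in function, without waiting and without touching the network.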

Eugene Yan answered Sep 19 '22 16:09


I had this problem occurring recently and wanted to share more detailed information about it.

The error occurs because the chosen streaming filter, test, is too broad. As a result, you receive data faster than you can consume it, which causes an IncompleteRead error.

This can be fixed either by refining the search or by catching the more specific exception:

from http.client import IncompleteRead  # Python 3; on Python 2 use: from httplib import IncompleteRead
...
try:
    sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
    sapi.filter(track=["test"])
except IncompleteRead:
    pass
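
If the listener falling behind really is the cause, another option may be to keep on_data as cheap as possible. The following is only a sketch under that assumption, not something verified against tweepy: on_data would hand each tweet to a queue, and a background thread would do the slow file write. QueuedWriter and its methods are hypothetical names for this illustration, and the code uses the Python 3 module name queue (on Python 2 the module is called Queue).

```python
import queue
import threading


class QueuedWriter(object):
    """Hypothetical helper: accept lines quickly, write them on a worker thread."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._drain)
        self._thread.daemon = True
        self._thread.start()

    def put(self, line):
        # Only enqueue, so the caller (e.g. on_data) returns immediately.
        self._queue.put(line)

    def _drain(self):
        # Runs on the worker thread and writes items in arrival order.
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._fileobj.write(item)

    def close(self):
        # Everything queued before the sentinel is written first.
        self._queue.put(self._STOP)
        self._thread.join()
```

In the question's script, on_data could then call writer.put(tweet) on a QueuedWriter wrapping save_file instead of printing and writing inline. Whether this removes the IncompleteRead error in practice is untested here.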
Leb answered Sep 21 '22 16:09