I am trying to store Twitter stream data in MongoDB. The code is pretty much a copy of http://stats.seandolinar.com/collecting-twitter-data-storing-tweets-in-mongodb/ but it always shows an error. If I print the data instead, the JSON output keeps growing and never seems to stop, despite the time limit on the while loop.
class listener(StreamListener):

    def __init__(self, start_time, time_limit=60):
        self.time = start_time
        self.limit = time_limit

    def on_data(self, data):
        while (time.time() - self.time) < self.limit:
            try:
                tweet = json.loads(data)
                client = MongoClient('localhost', 27017)
                db = client['twitter_db']
                collection = db['twitter_collection']
                collection.insert_many(tweet)
                return True
            except BaseException, e:
                print 'failed ondata,', str(e)
                time.sleep(5)
                pass
        exit()

    def on_error(self, status):
        print statuses
You are using the wrong method to insert the document into your collection. In your case, json.loads returns a dictionary, not a list, so you need to use the insert_one method to insert that single document; insert_many only accepts an iterable of documents.
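To illustrate the difference, here is a minimal sketch rather than a drop-in fix. The helper name store_tweet is hypothetical, and it assumes `collection` is a pymongo Collection (or anything exposing insert_one and insert_many). It parses the payload and picks the insert call that matches what json.loads returned.

```python
import json


def store_tweet(collection, raw_data):
    """Parse one streamed payload and insert it with the matching call.

    `collection` is assumed to be a pymongo Collection; `store_tweet`
    itself is a hypothetical helper, not part of pymongo or tweepy.
    Returns the number of documents handed to the collection.
    """
    parsed = json.loads(raw_data)

    if isinstance(parsed, dict):
        # A single tweet decodes to a dict, i.e. one document.
        collection.insert_one(parsed)
        return 1

    if isinstance(parsed, list):
        # insert_many rejects an empty list, so skip the call in that case.
        if parsed:
            collection.insert_many(parsed)
        return len(parsed)

    raise TypeError(
        "expected a JSON object or array, got %s" % type(parsed).__name__
    )
```

Inside on_data this would be called as store_tweet(collection, data), with collection obtained from MongoClient('localhost', 27017)['twitter_db']['twitter_collection']. Creating the client once, outside on_data, is a design suggestion on my part and not something the linked tutorial requires.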