I have a collection with 500K+ documents stored on a single-node MongoDB instance. Every now and then my pymongo find() call fails because the cursor times out.
While I could set the find to ignore the timeout, I do not like that approach. Instead, I tried a generator (adapted from this answer and this link):
def mongo_iterator(self, cursor, limit=1000):
    skip = 0
    while True:
        results = cursor.find({}).sort("signature", 1).skip(skip).limit(limit)
        try:
            results.next()
        except StopIteration:
            break
        for result in results:
            yield result
        skip += limit
I then call this method using:
ref_results_iter = self.mongo_iterator(cursor=latest_rents_refs, limit=50000)
for ref in ref_results_iter:
    results_latest1.append(ref)
The problem: my iterator does not return the full number of results. The issue is that next() advances the cursor, so I lose one element for every page I fetch.
The question: is there a way to adapt this code so that I can check whether a next element exists? PyMongo 3.x does not provide hasNext(), and the 'alive' check is not guaranteed to return False.
The .find() method takes additional keyword arguments. One of them is no_cursor_timeout, which you need to set to True:
cursor = collection.find({}, no_cursor_timeout=True)
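A cursor opened with no_cursor_timeout=True has to be closed explicitly, otherwise it can linger on the server. A minimal sketch of one way to do that, assuming `collection` is a pymongo Collection; `fetch_all` is a hypothetical helper name, not part of pymongo:

```python
def fetch_all(collection, query=None):
    """Return every matching document as a list (hypothetical helper)."""
    # no_cursor_timeout=True asks the server not to expire the idle cursor.
    cursor = collection.find(query or {}, no_cursor_timeout=True)
    try:
        return [doc for doc in cursor]
    finally:
        # Always release the server-side cursor, even if iteration fails.
        cursor.close()
```

Calling `fetch_all(collection)` would then replace the manual append loop from the question, at the cost of holding all documents in memory at once.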
You don't need to write your own generator function. The find() method returns a generator-like object.
Why not use
for result in results:
    yield result
The for loop should handle StopIteration for you.
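Combining that suggestion with the paging loop from the question, a sketch of the generator without the next() probe might look like the following. It counts the documents in each page and stops once a page comes back short, so nothing is consumed before the for loop runs. The parameter names `page_size` and `sort_key` are assumptions for illustration, and `collection` is assumed to behave like a pymongo Collection:

```python
def mongo_iterator(collection, page_size=1000, sort_key="signature"):
    """Yield every document in `collection`, fetching one page at a time."""
    skip = 0
    while True:
        page = collection.find({}).sort(sort_key, 1).skip(skip).limit(page_size)
        count = 0
        for doc in page:  # the for loop handles StopIteration itself
            count += 1
            yield doc
        # A short (or empty) page means there is nothing left to fetch.
        if count < page_size:
            break
        skip += page_size
```

Usage stays the same as in the question, for example `for ref in mongo_iterator(latest_rents_refs, page_size=50000): ...`. Note that skip() gets slower as the offset grows, so this is a sketch of the fix rather than a tuned solution.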