I was reading about MongoDB and came across this part of the tutorial: http://www.mongodb.org/display/DOCS/Tutorial. It says:
> var cursor = db.things.find();
> printjson(cursor[4]);
{ "_id" : ObjectId("4c220a42f3924d31102bd858"), "x" : 4, "j" : 3 }
"When using a cursor this way, note that all values up to the highest accessed (cursor[4] above) are loaded into RAM at the same time. This is inappropriate for large result sets, as you will run out of memory. Cursors should be used as an iterator with any query which returns a large number of elements."
How do I use a cursor as an iterator with a query? Thanks for the help.
You've tagged that you're using pymongo, so I'll give you two pymongo examples using the cursor as an iterator:
import pymongo

# MongoClient replaces the older pymongo.Connection, which was removed in PyMongo 3.
cursor = pymongo.MongoClient().test_db.test_collection.find()
for item in cursor:
    print(item)  # each item is a dictionary
and
import pymongo

cursor = pymongo.MongoClient().test_db.test_collection.find()
# Build a list holding the value of some_attribute for each item in the
# collection. The whole list lives in memory, so do this only for small results.
results = [item['some_attribute'] for item in cursor]
In addition, you can set the size of batches returned to the pymongo driver by doing this:
import pymongo

cursor = pymongo.MongoClient().test_db.test_collection.find()
cursor.batch_size(20)  # the cursor now fetches items from the server in batches of 20
It is usually unnecessary to change the batch size. If the machine running the driver has memory problems and is page faulting while you process the query results, though, setting it may give better performance. (This seems like a painful optimization to me, and I've always left the default.)
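If what you actually want is to handle results a few at a time in your own code, that is separate from the driver's batch size, and you can do it on the client with plain iteration. Below is a minimal sketch using only the standard library. `iter_chunks` is a hypothetical helper name, and `fake_cursor` is a generator standing in for a real pymongo cursor so that the snippet runs without a database; any iterable, including a cursor returned by find(), should work the same way.

```python
from itertools import islice


def iter_chunks(cursor, size):
    """Yield lists of at most `size` items from any iterable, lazily.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    iterator = iter(cursor)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Stand-in for a pymongo cursor (assumption): yields documents one by one.
fake_cursor = ({"x": i} for i in range(7))

chunks = list(iter_chunks(fake_cursor, 3))
chunk_sizes = [len(chunk) for chunk in chunks]
```

In real use you would loop over `iter_chunks(cursor, n)` and process each chunk inside the loop instead of collecting the chunks into a list, which is done here only to inspect the result.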
As for the JavaScript driver (the one that loads when you launch the "shell"), that part of the documentation is cautioning you not to use "array mode". From the online manual:
Array Mode in the Shell
Note that in some languages, like JavaScript, the driver supports an "array mode". Please check your driver documentation for specifics.
In the db shell, to use the cursor in array mode, use array index [] operations and the length property.
Array mode will load all data into RAM up to the highest index requested. Thus it should not be used for any query which can return very large amounts of data: you will run out of memory on the client.
You may also call toArray() on a cursor. toArray() will load all objects queried into RAM.
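To make the difference concrete, here is a toy model of the two access styles, written in Python to match the examples above. `ToyCursor` is a hypothetical class invented for illustration, not part of pymongo or the shell. It only mimics the behavior the manual describes: indexing retains every document up to the requested index, while iterating hands documents over one at a time and retains nothing.

```python
class ToyCursor:
    """Hypothetical stand-in for a driver cursor; not a real pymongo class."""

    def __init__(self, source):
        self._source = iter(source)
        self._cache = []  # documents retained in memory by array mode

    def __iter__(self):
        # Iterator mode: pass documents through without keeping them.
        return self._source

    def __getitem__(self, index):
        # Array mode: pull and retain everything up to `index`.
        while len(self._cache) <= index:
            self._cache.append(next(self._source))
        return self._cache[index]

    def cached(self):
        return len(self._cache)


# Array mode: asking for element 4 keeps elements 0 through 4 in memory.
array_cursor = ToyCursor({"x": i} for i in range(1000))
fifth = array_cursor[4]
held_after_index = array_cursor.cached()

# Iterator mode: all 1000 documents are visited, none are retained.
iter_cursor = ToyCursor({"x": i} for i in range(1000))
total = sum(doc["x"] for doc in iter_cursor)
held_after_iteration = iter_cursor.cached()
```

With a large index, the cache in array mode grows to match it, which is the memory problem the manual warns about.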