
How to read through collection in chunks by 1000?

I need to read the whole collection from MongoDB (the collection name is "test") in Python code. I tried this:

    from pymongo import MongoClient  # Connection was removed in PyMongo 3.0

    self.__connection__ = MongoClient('localhost', 27017)
    dbh = self.__connection__['test_db']
    collection = dbh['test']

How can I read through the collection in chunks of 1000 (to avoid running out of memory, because the collection can be very large)?

Damir asked Mar 20 '12 12:03


1 Answer

I agree with Remon, but you mention batches of 1000, which his answer doesn't really cover. You can set a batch size on the cursor:

    cursor.batch_size(1000)

You can also skip records, e.g.:

    cursor.skip(4000)

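Is this what you're looking for? Skipping records like this is effectively a pagination pattern. However, if you're just trying to avoid memory exhaustion, you don't really need to set batch size or skip: the cursor fetches documents from the server in batches as you iterate, so the whole collection is never held in memory at once.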
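If you do want to handle the documents in explicit groups of 1000, one possible approach is to wrap the cursor in a small generator. The sketch below is only an illustration: `iter_chunks` and `count_in_chunks` are hypothetical helper names, not part of PyMongo, and `collection` is assumed to be a PyMongo collection like the one in the question.

```python
from itertools import islice


def iter_chunks(iterable, size=1000):
    """Lazily yield lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items; nothing beyond that is read yet.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def count_in_chunks(collection, size=1000):
    """Walk a collection chunk by chunk and return the number of documents seen.

    Assumes `collection` behaves like a PyMongo collection, i.e. find()
    returns an iterable cursor that supports batch_size().
    """
    cursor = collection.find().batch_size(size)
    total = 0
    for chunk in iter_chunks(cursor, size):
        total += len(chunk)  # replace with your real per-chunk processing
    return total
```

Only one chunk is held in a Python list at a time, and `iter_chunks` works on any iterable, so you can try it without a database, for example `for chunk in iter_chunks(range(2500)): ...`.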

Mick Sear answered Sep 19 '22 05:09