I need to read a whole collection from MongoDB (the collection name is "test") in Python. I tried something like:
from pymongo import Connection  # legacy API; pymongo 3+ replaces Connection with MongoClient

self.__connection__ = Connection('localhost', 27017)
dbh = self.__connection__['test_db']
collection = dbh['test']
How can I read through the collection in chunks of 1000, to avoid memory overflow, since the collection can be very large?
Retrieving documents one at a time from the server is very inefficient. The batch size controls how many documents the driver requests from the server in one round trip.
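For example, here is a minimal sketch using the modern MongoClient API (the database and collection names are taken from the question; process() is a hypothetical placeholder for your own handling):

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
collection = client['test_db']['test']

# batch_size(1000) makes the driver fetch 1000 documents per round trip;
# iterating the cursor keeps only the current batch in memory.
for doc in collection.find().batch_size(1000):
    process(doc)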
I agree with Remon, but you mention batches of 1000, which his answer doesn't really cover. You can set a batch size on the cursor:
cursor.batch_size(1000)
You can also skip records, e.g.:
cursor.skip(4000)
Is this what you're looking for? This is effectively a pagination pattern. However, if you're just trying to avoid memory exhaustion, you don't really need to set a batch size or skip at all.
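To illustrate that last point, here is a sketch (assuming the collection handle from the question, with handle() as a hypothetical placeholder):

# Plain iteration already streams results in server-side batches;
# the full collection is never held in memory at once.
for doc in collection.find():
    handle(doc)

# If you explicitly want 1000-document pages (the pagination pattern),
# skip/limit works, though skip gets slower as the offset grows:
page_size = 1000
offset = 0
while True:
    page = list(collection.find().skip(offset).limit(page_size))
    if not page:
        break
    for doc in page:
        handle(doc)
    offset += page_size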