I have a huge collection of documents in my DB and I'm wondering how I can run through all the documents and update them, each document with a different value.
MongoDB also allows you to iterate a cursor manually. To do so, simply assign the cursor returned by the find() method to a variable using the var keyword (or any JavaScript variable). Note: if a cursor is inactive for 10 minutes, the MongoDB server will automatically close it.
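For instance, a minimal sketch in the mongo shell (the users collection and the status query are just placeholders):

var myCursor = db.users.find({ status: "active" }); // hypothetical collection and query
while (myCursor.hasNext()) {
  printjson(myCursor.next()); // advance the cursor manually and print each document
}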
A cursor is the object returned when the find() method executes: a handle over the matching documents rather than the documents themselves. By default it is iterated automatically as a loop, but you can also explicitly retrieve the document at a specific index from the returned cursor. It is much like a pointer, pointing at a specific index in the result set.
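One way to get the document at a specific index is to convert the cursor to an array first. A mongo shell sketch, again with a placeholder collection:

var myCursor = db.users.find({ status: "active" }); // hypothetical collection and query
var documentArray = myCursor.toArray();             // exhausts the cursor into an array
var fourthDocument = documentArray[3];              // index access, like a pointer offset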
cursor.batchSize(size) specifies the number of documents to return in each batch of the response from the MongoDB instance. In most cases, modifying the batch size will not affect the user or the application, as the mongo shell and most drivers return results as if MongoDB had returned a single batch.
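For example (mongo shell sketch; the inventory collection is a placeholder):

var myCursor = db.inventory.find().batchSize(100); // server returns at most 100 documents per batch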
Closing the cursor is only really required when you do not "exhaust" the results, that is, when you do not iterate over all the possible results returned by the cursor. Leaving a cursor open is like leaving a connection open that never gets re-used.
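If you only consume part of the results, you can close the cursor explicitly. A sketch with the Node.js driver, assuming collection and query are already defined and this runs inside an async function:

const cursor = collection.find(query);
const firstDoc = await cursor.next(); // consume only the first result
await cursor.close();                 // release the server-side cursor instead of leaving it open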
The answer depends on the driver you're using. All MongoDB drivers I know have cursor.forEach() implemented one way or another.
Here are some examples:
collection.find(query).forEach(function(doc) {
  // handle
}, function(err) {
  // done or error
});

db.collection.find(query).forEach(function(err, doc) {
  // handle
});

collection.find(query, { stream: true })
  .each(function(doc){
    // handle doc
  })
  .error(function(err){
    // handle error
  })
  .success(function(){
    // final callback
  });

collection.find(query).stream()
  .on('data', function(doc){
    // handle doc
  })
  .on('error', function(err){
    // handle error
  })
  .on('end', function(){
    // final callback
  });
The only problem with updating documents inside of a .forEach callback is that you have no idea when all the documents have been updated.
To solve this problem you should use some asynchronous control-flow solution, such as the async library or promises. Here is an example using async and its queue feature:
var q = async.queue(function (doc, callback) {
  // code for your update
  collection.update({ _id: doc._id }, { $set: { hi: 'there' } }, { w: 1 }, callback);
}, Infinity);

var cursor = collection.find(query);
cursor.each(function(err, doc) {
  if (err) throw err;
  if (doc) q.push(doc); // dispatching doc to async.queue
});

q.drain = function() {
  if (cursor.isClosed()) {
    console.log('all items have been processed');
    db.close();
  }
}
Using the mongodb driver, and modern NodeJS with async/await, a good solution is to use next():
const collection = db.collection('things')
const cursor = collection.find({
  bla: 42 // find all things where bla is 42
});
let document;
while ((document = await cursor.next())) {
  await collection.findOneAndUpdate({
    _id: document._id
  }, {
    $set: { blu: 43 }
  });
}
This results in only one document at a time being held in memory, as opposed to e.g. the accepted answer, where many documents get sucked into memory before processing of the documents starts. In the case of "huge collections" (as per the question) this may be important.
If documents are large, this can be improved further by using a projection, so that only those fields that are actually required are fetched from the database.
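For example, building on the snippet above, a sketch that projects only the _id field, which is the only field the update needs (this assumes a modern driver version where the projection is passed via the projection option):

const cursor = collection.find(
  { bla: 42 },
  { projection: { _id: 1 } } // fetch only _id; all other fields stay on the server
);
let document;
while ((document = await cursor.next())) {
  await collection.findOneAndUpdate({ _id: document._id }, { $set: { blu: 43 } });
}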