Correct way to insert many records into Mongodb with Node.js

What is the correct way to do bulk inserts into MongoDB (although it could be any other database) with Node.js?

I have written the following code as an example, although I believe it is flawed, as db.close() may be run before all the asynchronous collection.insert calls have completed.

MongoClient.connect('mongodb://127.0.0.1:27017/test', function (err, db) {
    var i, collection;
    if (err) {
        throw err;
    }
    collection = db.collection('entries');
    for (i = 0; i < entries.length; i++) {
        collection.insert(entries[i].entry);
    }
    db.close();
});

Richard Hensman asked Dec 30 '15 13:12



2 Answers

If your MongoDB server is 2.6 or newer, it is better to use the Bulk API, which is built on the server's write commands. The bulk operations are abstractions on top of those commands that make it easy to build batched operations, and so give you performance gains when writing to large collections.

Sending the insert operations in batches means less traffic to the server: rather than sending every document as an individual statement, the driver sends them in manageable chunks for the server to commit. There is also less time spent waiting for responses in callbacks with this approach.
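
The batching idea itself does not depend on the driver. As a minimal sketch, here is a plain helper that splits an array into chunks of a fixed size; the name chunk and the placeholder documents are made up for illustration and are not part of the MongoDB API.

```javascript
// Split an array into consecutive batches of at most `size` items.
// `chunk` is a hypothetical helper name, not part of the MongoDB driver.
function chunk(items, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError('size must be a positive integer');
    }
    var batches = [];
    for (var i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// 2500 placeholder documents, batched 1000 at a time
var docs = [];
for (var n = 0; n < 2500; n++) {
    docs.push({ n: n });
}
var batches = chunk(docs, 1000);
console.log(batches.map(function (b) { return b.length; })); // [ 1000, 1000, 500 ]
```

Each batch can then be handed to a single bulk call, so 2500 documents cost three round trips instead of 2500.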

These bulk operations come mainly in two flavours:

  • Ordered bulk operations. These execute all the operations in order and error out on the first write error.
  • Unordered bulk operations. These execute all the operations in parallel and aggregate all the errors. Unordered bulk operations do not guarantee the order of execution.

Note that for servers older than 2.6 the API will downconvert the operations. However, it is not possible to downconvert 100%, so there might be some edge cases where it cannot correctly report the right numbers.
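
With bulkWrite, the choice between the two flavours is made through the ordered option, which defaults to true. The following is a minimal sketch, assuming collection is a driver Collection object; toInsertOps and insertUnordered are hypothetical helper names used only for illustration.

```javascript
// Wrap each document in the insertOne operation shape that bulkWrite expects.
function toInsertOps(docs) {
    return docs.map(function (doc) {
        return { insertOne: { document: doc } };
    });
}

// Unordered: the writes may be applied in any order, and the remaining
// writes are still attempted after an individual write fails.
// Returns the promise produced by bulkWrite.
function insertUnordered(collection, docs) {
    return collection.bulkWrite(toInsertOps(docs), { ordered: false });
}
```

Unordered is usually the better fit when the documents are independent of each other, since one bad document does not stop the rest of the batch.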

In your case, you could implement the Bulk API insert operation in batches of 1000 like this:

For MongoDB 3.2+ using bulkWrite

var MongoClient = require('mongodb').MongoClient;
var url = 'mongodb://localhost:27017/test';
var entries = [ ... ] // a huge array containing the entry objects

var createNewEntries = function(db, entries, callback) {

    // Get the collection and set up the batch bookkeeping
    var collection = db.collection('entries'),
        bulkUpdateOps = [],
        pending = [];   // one promise per batch sent to the server

    entries.forEach(function(doc) {
        bulkUpdateOps.push({ "insertOne": { "document": doc } });

        if (bulkUpdateOps.length === 1000) {
            // Send the full batch and start a new one
            pending.push(collection.bulkWrite(bulkUpdateOps));
            bulkUpdateOps = [];
        }
    });

    // Send whatever is left over in the last, partial batch
    if (bulkUpdateOps.length > 0) {
        pending.push(collection.bulkWrite(bulkUpdateOps));
    }

    // Call back only once every batch has completed (or one has failed),
    // so the caller does not close the connection too early
    Promise.all(pending).then(function(results) {
        callback(null, results);
    }, function(err) {
        callback(err);
    });
};

For MongoDB <3.2

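var MongoClient = require('mongodb').MongoClient;
var url = 'mongodb://localhost:27017/test';
var entries = [ ... ] // a huge array containing the entry objects

var createNewEntries = function(db, entries, callback) {

    // Get the collection and bulk api artefacts
    var collection = db.collection('entries'),
        bulk = collection.initializeOrderedBulkOp(), // Initialize the Ordered Batch
        counter = 0,
        pending = 0,      // batches sent but not yet acknowledged
        queued = false,   // true once every batch has been sent
        failed = false;   // true once an error has been reported

    // Shared handler for every batch: report the first error, or finish
    // once all batches have been sent and acknowledged
    var onExecuted = function(err, result) {
        pending--;
        if (failed) return;
        if (err) { failed = true; return callback(err); }
        if (queued && pending === 0) callback();
    };

    // Execute the forEach method, triggers for each entry in the array
    entries.forEach(function(obj) {

        bulk.insert(obj);
        counter++;

        if (counter % 1000 == 0) {
            // Execute the operation
            pending++;
            bulk.execute(onExecuted);
            // re-initialise straight away so later inserts go into a fresh batch
            bulk = collection.initializeOrderedBulkOp();
        }
    });

    // Execute the last, partial batch if there is one
    if (counter % 1000 != 0) {
        pending++;
        bulk.execute(onExecuted);
    }

    queued = true;
    // Nothing in flight (e.g. entries was empty): finish immediately
    if (pending === 0 && !failed) callback();
};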

Call the createNewEntries() function.

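MongoClient.connect(url, function(err, db) {
    if (err) throw err;
    createNewEntries(db, entries, function(err) {
        // Close the connection whether or not the inserts succeeded
        db.close();
        if (err) throw err;
    });
});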

chridam answered Sep 28 '22 05:09


You can use insertMany. It accepts an array of objects. Check the API.
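
As a rough sketch of how that could look for a large array, the helper below sends the documents in batches and waits for each insertMany call before sending the next one. It assumes a Node.js version with async/await and a driver version where insertMany returns a promise when no callback is passed; insertInBatches is a hypothetical name, not a driver function.

```javascript
// Insert `docs` in batches of `batchSize`, waiting for each batch to finish
// before sending the next. Resolves with the total number of documents inserted.
// `insertInBatches` is a hypothetical helper, not part of the MongoDB driver.
async function insertInBatches(collection, docs, batchSize) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    let inserted = 0;
    for (let i = 0; i < docs.length; i += batchSize) {
        const result = await collection.insertMany(docs.slice(i, i + batchSize));
        inserted += result.insertedCount;
    }
    return inserted;
}
```

Because the returned promise only resolves after the last batch, calling db.close() in its then handler avoids the problem from the question of closing the connection while inserts are still running.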

Arjan Frans answered Sep 28 '22 04:09