Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bulk operation by mongoose

I want to store bulk data (more than 1000 or 10000 records) in a single operation by MongoOSE. But MongoOSE does not support bulk operations so I will use the native driver (MongoDB, for insertion). I know that I will bypass all MongoOSE middlewares but its ok. (Please correct me If I am wrong! :) )

I have an option to store data by insert method. But MongoDB also provides Bulk class (ordered and unordered operations). Now I have the following questions:

  • Difference between insert and bulk operation (both can store bulk data) ?
  • Any specific difference between initializeUnorderedBulkOp() (performs operation in serially) and initializeOrderedBulkOp() (performs operations in parallel) ?
  • If I will use initializeUnorderedBulkOp then it will effect on by range search or any side-effects ?
  • Can I do it by Promisification (by BlueBird) ?? (I am trying to do it.)

Thanks

EDIT: I am talking about bulk vs insert regarding to multiple insertions. Which one is better? Insertion one by one by bulk builder OR insertion by batches (1000) in insert method. I hope now it will clear Mongoose (mongodb) batch insert? this link

like image 581
Manish Trivedi Avatar asked Jul 09 '15 06:07

Manish Trivedi


1 Answers

If you are calling this from a mongoose model you need the .collection accessor

var bulk = Model.collection.initializeOrderedBulkOp();

// examples
bulk.insert({ "a": 1 });
bulk.find({ "a": 1 }).updateOne({ "$set": { "a": 2 } });

bulk.execute(function(err,result) {
   // result contains stats of the operations
});

You need to be "careful" when doing this though. Apart from not being bound to the same checks and validation that can be attached to mongoose schemas, when you call .collection you need to be "sure" that the connection to the database has already been made. Mongoose methods look after this for you, but once you use the underlying driver methods you are all on your own.

As for diffferences it's all there in the naming:

  • Ordered: Means that the batched instructions are executed in the same order they are added. They execute one after the other in sequence and one at a time. If an error occurs at any point, the execution of the batch is halted and the error response returned. All operations up until then are "comitted". This is not a rollback.

  • UnOrdered: Means that batched operations can execute in "any" sequence and often in parallel. This can lead to faster updates, but of course cannot be used where one bulk operation in the batch is meant to occur before another ( example above ). Any errors that occur are merely "reported" in the result, and the whole batch will complete as sent to the server.

Of course the core difference for either type of execution from the standard methods is that the "whole batch" ( actually in lots of 1000 maximum ) is sent to the server and you only get one response back. This saves network traffic and waiting for each idividual .insert() or other like operation to complete.

As for can a "promise" be used, well anything else with a callback that you can convert to returning a promise follows the same rules as here. Remember though that the "callback/promise" is on the .execute() method, and that what you get back complies to the rules of what is returned from Bulk operations results.

For more information see "Bulk" in the core documentation.

like image 126
Blakes Seven Avatar answered Oct 10 '22 00:10

Blakes Seven