I have a collection with 9 million records. I am currently using the following script to update the entire collection:
simple_update.js
db.mydata.find().forEach(function(data) {
    db.mydata.update(
        { _id: data._id },
        { $set: { pid: (2571 - data.Y + (data.X * 2572)) } }
    );
});
This is run from the command line as follows:
mongo my_test simple_update.js
So all I am doing is adding a new field, pid, based on a simple calculation.
Is there a faster way? This takes a significant amount of time.
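For reference, the per-document calculation is plain arithmetic on the X and Y fields; pulled out into a stand-alone JavaScript function (the helper name pid is just illustrative), it is:

```javascript
// Stand-alone version of the calculation from simple_update.js.
// x and y correspond to data.X and data.Y read from each document.
function pid(x, y) {
    return 2571 - y + (x * 2572);
}
```

For example, pid(0, 0) returns 2571, and pid(1, 0) returns 5143.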
There are two things that you can do:

1. Send the update with the multi flag set to true, so a single command covers every matching document.
2. Store the function server-side and run it there, using server-side code execution.

The MongoDB documentation on server-side code execution also contains the following advice:
This is a good technique for performing batch administrative work. Run mongo on the server, connecting via the localhost interface. The connection is then very fast and low latency. This is friendlier than db.eval() as db.eval() blocks other operations.
This is probably the fastest you'll get. You have to realize that issuing 9M updates on a single server is going to be a heavy operation. Even if you could get 3k updates/second, you're still talking about nearly an hour of runtime (9,000,000 / 3,000 per second = 3,000 seconds, or about 50 minutes).
And that's not really a "mongo problem", that's going to be a hardware limitation.
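One way to cut per-operation overhead is to batch the writes instead of issuing one round trip per document. A sketch using the mongo shell's unordered Bulk API (available since MongoDB 2.6); the batch size of 1000 is an arbitrary choice:

```javascript
// Sketch: batch the per-document updates with the unordered Bulk API.
// Assumes the same my_test database and mydata collection as above.
var bulk = db.mydata.initializeUnorderedBulkOp();
var count = 0;

db.mydata.find().forEach(function(data) {
    bulk.find({ _id: data._id }).updateOne(
        { $set: { pid: (2571 - data.Y + (data.X * 2572)) } }
    );
    count++;
    if (count % 1000 === 0) {   // flush every 1000 queued updates
        bulk.execute();
        bulk = db.mydata.initializeUnorderedBulkOp();
    }
});

if (count % 1000 !== 0) {
    bulk.execute();             // flush the remainder
}
```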
I am using the db.collection.update method with the multi flag (the fourth argument) set to true, so one command updates every matching document:

// db.collection.update( criteria, objNew, upsert, multi )  // --> for reference
db.collection.update( { "_id" : { $exists : true } }, objNew, upsert, true );
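Note that a single multi-update with a static objNew cannot reproduce this particular workload, because pid depends on each document's own X and Y. On MongoDB 4.2 or newer, an aggregation-pipeline update can compute the value per document entirely server-side in one command; a sketch:

```javascript
// Sketch, assuming MongoDB 4.2+ (pipeline-style updates).
// Computes pid = 2571 - Y + (X * 2572) on the server for every document.
db.mydata.updateMany(
    { },
    [ { $set: {
        pid: { $subtract: [
            { $add: [ 2571, { $multiply: [ "$X", 2572 ] } ] },
            "$Y"
        ] }
    } } ]
);
```

This avoids shipping 9M documents to the client and back, which is where most of the time in the forEach approach goes.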