 

Why are inserts slow in the 2.6 MongoDB shell compared to previous versions?

Tags:

mongodb

When inserting test data into MongoDB, I usually just use a for loop to do a large number of single inserts. With 2.4 and below, this is pretty fast (~2 seconds), for example:

> db.timecheck.drop();
true
> start = new Date(); for(var i = 0; i < 100000; i++){db.timecheck.insert({"_id" : i})}; end = new Date(); print(end - start);
2246

Trying the same thing with 2.6 is significantly slower (~37 seconds):

> db.timecheck.drop();
true
> start = new Date(); for(var i = 0; i < 100000; i++){db.timecheck.insert({"_id" : i})}; end = new Date(); print(end - start);
37169

That is much, much slower. So, why is there such a difference with the new version and how can I fix it?

asked Mar 28 '14 by Adam Comerford

1 Answer

Before 2.6, the interactive shell would run through the loop and only check the success (using getLastError) of the last operation in the loop (more specifically, it called getLastError after each carriage return, so the last operation checked was the last insert in the loop). With 2.6, the shell now checks the status of each individual operation within the loop. Essentially, the "slowness" in 2.6 comes down to acknowledged versus unacknowledged write performance rather than an actual performance regression per se.

Acknowledged writes have been the default for some time now, so I think the behavior in 2.6 is more correct, though a little inconvenient for those of us used to the original behavior.
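If you just want to confirm that the gap is down to write acknowledgement, one way to check (a sketch for experimentation, not a recommendation for real workloads) is to pass an explicit unacknowledged write concern on each insert; with {w: 0} the 2.6 shell does not wait on a per-operation status, so the loop should run in roughly the old time. Exact figures will vary by machine, so the output is omitted here:

> db.timecheck.drop();
true
> // w: 0 requests an unacknowledged write, i.e. the shell does not wait for
> // a per-operation status, much like the pre-2.6 loop behavior.
> start = new Date(); for(var i = 0; i < 100000; i++){db.timecheck.insert({"_id" : i}, {writeConcern: {w: 0}})}; end = new Date(); print(end - start);

Of course, unacknowledged writes give up error reporting entirely, which is exactly what the bulk API below avoids.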

To get back to your previous level of performance, the answer is to use the new unordered bulk insert API. Here's a timed version:

> db.timecheck.drop();
true
> var bulk = db.timecheck.initializeUnorderedBulkOp(); start = new Date(); for(var i = 0; i < 100000; i++){bulk.insert({"_id" : i})}; bulk.execute({w:1}); end = new Date(); print(end - start);
2246

That's now back to essentially the same performance, at just over 2 seconds. Sure, it's a little more bulky (pardon the pun), but you know exactly what you are getting, which I think is a good thing in general. There is also an upside when you are not looking for timing information. Let's get rid of the timing code and run the insert again:

> db.timecheck.drop();
true
> var bulk = db.timecheck.initializeUnorderedBulkOp(); for(var i = 0; i < 100000; i++){bulk.insert({"_id" : i})}; bulk.execute({w:1});
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 100000,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})

Now we get a nice result document when we do the bulk insert, rather than a check on just the last operation (all the rest in the 2.4 version were essentially fire and forget). Because it is an unordered bulk operation, it will continue if it encounters an error and report each such error in this document. There are none to be seen in the example above, but it's easy to artificially create a failure scenario. Let's pre-insert a value we know will come up and hence cause a duplicate key error on the (default) unique _id index:

> db.timecheck.drop();
true
> db.timecheck.insert({_id : 500})
WriteResult({ "nInserted" : 1 })
> var bulk = db.timecheck.initializeUnorderedBulkOp(); for(var i = 0; i < 100000; i++){bulk.insert({"_id" : i})}; bulk.execute({w:1});
2014-03-28T16:19:40.923+0000 BulkWriteError({
"writeErrors" : [
{
"index" : 500,
"code" : 11000,
"errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.timecheck.$_id_ dup key: { : 500.0 }",
"op" : {
"_id" : 500
}
}
],
"writeConcernErrors" : [ ],
"nInserted" : 99999,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})

Now we can see how many were successful, which one failed (and why). It may be a little more complicated to set up, but overall I think it's an improvement.
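As a contrasting sketch (not from the original answer): if you instead use an ordered bulk operation via initializeOrderedBulkOp(), execution stops at the first error, so in the same duplicate _id scenario you would expect nInserted to be 500 (documents 0 through 499) and the remaining inserts to be skipped:

> db.timecheck.drop();
true
> db.timecheck.insert({_id : 500})
WriteResult({ "nInserted" : 1 })
> // Ordered bulk ops halt on the first error rather than continuing, so the
> // resulting BulkWriteError should report nInserted : 500 and a single write error.
> var bulk = db.timecheck.initializeOrderedBulkOp(); for(var i = 0; i < 100000; i++){bulk.insert({"_id" : i})}; bulk.execute({w:1});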

With all of that said, and the new preferred way outlined, there is a way to force the shell back to legacy mode. This makes sense, since a 2.6 shell might have to connect to, and work with, older servers. If you connect to a 2.4 server, this will be taken care of for you, but to force the matter for a particular connection you can run:

db.getMongo().forceWriteMode("legacy");

Once you are done, you can revert to the 2.6 behavior with:

db.getMongo().forceWriteMode("commands");

For actual usage, see my crud.js snippet. This works for now, but it may be removed without notice at any point in the future and is really not intended for extensive use, so use it at your own risk.
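As a rough sketch of how that toggle might be used (assuming, as above, a shell connection called db, and acknowledging that timings will vary by machine), you could wrap the original insert loop in the two forceWriteMode calls:

> db.timecheck.drop();
true
> // Switch this connection to legacy (getLastError-style) writes, time the
> // plain insert loop from the question, then switch back to write commands.
> db.getMongo().forceWriteMode("legacy");
> start = new Date(); for(var i = 0; i < 100000; i++){db.timecheck.insert({"_id" : i})}; end = new Date(); print(end - start);
> db.getMongo().forceWriteMode("commands");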

answered by Adam Comerford