How to improve MongoDB insert performance

The result:

If you are operating on a dataset that is fault tolerant, or doing a one-time process you can verify afterwards, changing the write concern (WriteConcern) to Unacknowledged can help.

Also, bulk operations are ordered (IsOrdered = true) by default, which I was not aware of. Setting this to false makes the operation actually execute in bulk; otherwise it runs as a single ordered stream of updates.
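For reference, a minimal sketch of what those two settings look like with the 2.x C# driver; the connection string and names are illustrative, and updates stands for the prepared list of write models:

using MongoDB.Bson;
using MongoDB.Driver;

// Illustrative connection and names; "updates" is a prepared
// List<WriteModel<BsonDocument>> of operations.
var collection = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("db")
    .GetCollection<BsonDocument>("docs")
    .WithWriteConcern(WriteConcern.Unacknowledged); // w: 0, fire-and-forget

// IsOrdered = false lets the server run the batch in parallel instead of
// executing serially and stopping at the first error.
await collection.BulkWriteAsync(updates, new BulkWriteOptions { IsOrdered = false });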


MongoDB 3.0 / WiredTiger / C# Driver

I have a collection with 147,000,000 documents, against which I am performing updates of approximately 3,000 documents each second (hopefully).

Here is an example update:

"query" : {     "_id" : BinData(0,"UKnZwG54kOpT4q9CVWbf4zvdU223lrE5w/uIzXZcObQiAAAA") }, "updateobj" : {     "$set" : {         "b" : BinData(0,"D8u1Sk/fDES4IkipZzme7j2qJ4oWjlT3hvLiAilcIhU="),         "s" : true     } } 

This is a typical update; my requirement is for these to be applied at a rate of 3,000 per second.

Unfortunately they are taking twice as long as that; for instance, the last update covered 1,723 documents and took 1,061 ms.

The collection has an index only on _id and no other indexes; the average document size is 244 bytes, and the collection is uncapped.

The server has 64 GB of memory and 12 threads. Insert performance is excellent at lower collection sizes, say around 50 million documents, but after about 80 million it really starts to drop off.

Could it be because the entire set does not fit in memory? The database is backed by RAID 0 SSDs, so I/O performance should not be a bottleneck; and if it were, wouldn't it have shown up from the beginning?

Would appreciate some guidance, as I'm confident MongoDB can fulfill my rather meager requirements compared to some of the applications it is used in. There is not a substantial read rate on the database, so sharding would not improve matters, although perhaps I am wrong.

Either way, the current insert rate is not good enough.

Update: Here is the explain() of just the query...

"queryPlanner" : {     "plannerVersion" : 1,     "namespace" : "Collection",     "indexFilterSet" : false,     "parsedQuery" : {         "_id" : {             "$eq" : { "$binary" : "SxHHwTMEaOmSc9dD4ng/7ILty0Zu0qX38V81osVqWkAAAAAA", "$type" : "00" }         }     },     "winningPlan" : {         "stage" : "IDHACK"     },     "rejectedPlans" : [] }, "executionStats" : {     "executionSuccess" : true,     "nReturned" : 1,     "executionTimeMillis" : 1,     "totalKeysExamined" : 1,     "totalDocsExamined" : 1,     "executionStages" : {         "stage" : "IDHACK",         "nReturned" : 1,         "executionTimeMillisEstimate" : 0,         "works" : 2,         "advanced" : 1,         "needTime" : 0,         "needFetch" : 0,         "saveState" : 0,         "restoreState" : 0,         "isEOF" : 1,         "invalidates" : 0,         "keysExamined" : 1,         "docsExamined" : 1     },     "allPlansExecution" : [] }, 

The query itself is very fast, and the update operation takes about 25 ms; they are pushed to Mongo via the BulkWriter: await m_Collection.BulkWriteAsync(updates);
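For context, a sketch of how such a batch is typically assembled for BulkWriteAsync; pendingChanges, change.Id, and change.Payload are hypothetical stand-ins for the real source of the ~3,000 changes per second:

// Requires: using MongoDB.Bson; using MongoDB.Driver; using System.Collections.Generic;
// One UpdateOneModel per document, shipped in a single bulk call.
var updates = new List<WriteModel<BsonDocument>>();
foreach (var change in pendingChanges) // hypothetical change source
{
    var filter = Builders<BsonDocument>.Filter.Eq("_id",
        new BsonBinaryData(change.Id, BsonBinarySubType.Binary));
    var update = Builders<BsonDocument>.Update
        .Set("b", new BsonBinaryData(change.Payload, BsonBinarySubType.Binary))
        .Set("s", true);
    updates.Add(new UpdateOneModel<BsonDocument>(filter, update));
}
await m_Collection.BulkWriteAsync(updates);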

asked Jul 08 '15 by James




1 Answer

You can try lowering the write concern level. Obviously there is a risk in doing this, as you would no longer catch write errors, but you should still be able to catch network errors. Since MongoDB groups bulk insert operations into batches of 1,000, this should speed up the process.

W by default is 1:

[image: write acknowledgment with w: 1]

When you change it to 0:

[image: write acknowledgment with w: 0]
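In the C# driver the write concern is attached to the collection handle; a minimal sketch, reusing m_Collection and updates from the question:

// w: 0 (Unacknowledged): the driver does not wait for a server response,
// so write errors go unreported, but network errors still surface.
var unacknowledged = m_Collection.WithWriteConcern(WriteConcern.Unacknowledged);
await unacknowledged.BulkWriteAsync(updates);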

If you are not concerned about the order of the elements, you can gain some speed by issuing the bulk operation unordered:

await m_Collection.BulkWriteAsync(updates, new BulkWriteOptions() { IsOrdered = false }); 

With an unordered operations list, MongoDB can execute the write operations in the list in parallel and in any order (per the MongoDB documentation).
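Note that unordered execution also changes the error semantics: a failing operation no longer aborts the rest of the batch; the server attempts every operation and reports any accumulated errors together at the end.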

answered Sep 28 '22 by D.Rosado