
What is the optimum bulk item count with InsertBatch method in mongodb c# driver?

I have heard that large batch sizes don't really give any additional performance.

What is the optimum?

Serdar asked Apr 17 '13 06:04





1 Answer

If you call Insert to insert documents one at a time there is a network round trip for each document. If you call InsertBatch to insert documents in batches there is a network round trip for each batch instead of for each document. InsertBatch is more efficient than Insert because it reduces the number of network round trips.
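To make the per-batch round trip concrete, here is a minimal sketch in Python rather than C#. It splits a sequence of documents into fixed-size chunks and counts how many times a send function is called. The names `batches`, `insert_in_batches` and `send_batch` are hypothetical; `send_batch` only stands in for a driver call such as InsertBatch and is not part of any MongoDB driver.

```python
from itertools import islice


def batches(items, batch_size):
    """Yield successive lists holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(documents, batch_size, send_batch):
    """Call send_batch once per chunk and return the number of calls made.

    send_batch is a hypothetical stand-in for one network round trip,
    for example a single InsertBatch call in the driver.
    """
    calls = 0
    for chunk in batches(documents, batch_size):
        send_batch(chunk)
        calls += 1
    return calls
```

Under these assumptions, 25 documents with a batch size of 10 result in three calls: two full batches and one batch of 5.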

Suppose you had to insert 1,000,000 documents. You could analyze the number of network round trips for different batch sizes:

  • batch size 1: 1,000,000 round trips
  • batch size 10: 100,000 round trips
  • batch size 100: 10,000 round trips
  • batch size 1,000: 1,000 round trips
  • etc...

So you see that even a batch size as small as 10 has already eliminated 90% of the network round trips, and a batch size of 100 has eliminated 99% of the network round trips.
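The same arithmetic can be reproduced with a few lines of Python. This is only a sketch of the calculation above; the function names `round_trips` and `fraction_eliminated` are made up for illustration.

```python
def round_trips(total_documents, batch_size):
    """Number of batches needed, rounding up for a partial last batch."""
    return -(-total_documents // batch_size)  # integer ceiling division


def fraction_eliminated(total_documents, batch_size):
    """Share of round trips saved compared with inserting one document at a time."""
    return 1 - round_trips(total_documents, batch_size) / total_documents


for size in (1, 10, 100, 1000):
    trips = round_trips(1_000_000, size)
    saved = fraction_eliminated(1_000_000, size)
    print(f"batch size {size}: {trips:,} round trips, {saved:.1%} eliminated")
```

Each tenfold increase in batch size removes a smaller absolute number of round trips than the previous one, which is why the gains flatten out quickly.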

This is a somewhat simplified analysis because it ignores the fact that as the batch sizes increase so do the message sizes, but it's more or less accurate.
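One rough way to account for message size is to add a transfer term to the round-trip count. The Python sketch below assumes a deliberately simple model, total time = round trips × latency + total bytes / bandwidth. The default values (1 ms latency, 1 KB documents, 100 MB/s bandwidth) are invented for illustration and are not measurements of any real deployment.

```python
def estimated_seconds(total_documents, batch_size,
                      latency_s=0.001,             # assumed round-trip latency
                      doc_bytes=1024,              # assumed average document size
                      bandwidth_bytes_per_s=100e6  # assumed network throughput
                      ):
    """Estimate total insert time under a simple latency-plus-transfer model."""
    trips = -(-total_documents // batch_size)  # integer ceiling division
    transfer_s = total_documents * doc_bytes / bandwidth_bytes_per_s
    return trips * latency_s + transfer_s


for size in (1, 10, 100, 1000):
    print(f"batch size {size}: about {estimated_seconds(1_000_000, size):.2f} s")
```

In this model the transfer term is the same for every batch size, because the same bytes are sent either way. Only the latency term shrinks, so once it is small relative to the transfer term, larger batches change the total very little.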

I don't think there is any one optimum batch size. Larger batches generally perform better, but once you reach 10-100 documents per batch, going larger yields only very small additional improvements.

Robert Stam answered Nov 01 '22 20:11
