I need to update fields on 100,000+ documents roughly once per minute, and I have found that my current code on my PC (i7, 6 GB RAM, SSD) is not nearly efficient enough. As far as I understand, you cannot do batch updates with the driver (I am running the latest version from NuGet).
Here are the results (time to perform 25,000 updates) I obtained by running the code below:
As expected, indexing works best, and I am not sure why async is less efficient. When I grow to over 100,000 updates/min in the future, even the indexed approach might become too slow.
Is this the expected behaviour for FindOneAndReplaceAsync, and if so, is there another way to get better performance? Am I trying to do something with MongoDB that it is not designed for?
The code (MCVE ready):
using System;
using System.Diagnostics;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;

public class A
{
    public A(string id)
    {
        customId = id;
        TimeStamp = DateTime.UtcNow;
    }

    [BsonId]
    [BsonIgnoreIfDefault]
    ObjectId Id { get; set; }

    public string customId { get; set; }
    public double val { get; set; }
    public DateTime TimeStamp { get; set; }
}

class Program
{
    static IMongoCollection<A> Coll = new MongoClient("mongodb://localhost").GetDatabase("Test").GetCollection<A>("A");
    static FindOneAndReplaceOptions<A, A> Options = new FindOneAndReplaceOptions<A, A> { IsUpsert = true };

    // Upserts one document per round trip, keyed on customId.
    static void SaveDoc(A doc)
    {
        Coll.FindOneAndReplace(Builders<A>.Filter.Where(x => x.customId == doc.customId), doc, Options);
    }

    static void Main(string[] args)
    {
        var docs = Enumerable.Range(0, 25000).Select(x => new A(x.ToString()));

        Stopwatch sw = new Stopwatch();
        sw.Start();
        docs.ToList().ForEach(x => SaveDoc(x));
        sw.Stop();

        Debug.WriteLine(sw.ElapsedMilliseconds);
    }
}
I think the problem relates to the protocol and network latency: each individual update operation carries a serialization and transport penalty. You can use bulk writes to optimise batch operation performance.
In your case it would look like this:
// create a container for the bulk operations
var operations = new List<WriteModel<A>>();

// add the batched replace (upsert) operations
operations.Add(new ReplaceOneModel<A>(new BsonDocument("customId", doc1.customId), doc1) { IsUpsert = true });
operations.Add(new ReplaceOneModel<A>(new BsonDocument("customId", doc2.customId), doc2) { IsUpsert = true });

// execute the BulkWrite operation
collection.BulkWrite(operations, new BulkWriteOptions { BypassDocumentValidation = true, IsOrdered = false });
I would recommend limiting the batch size to no more than 1,000 documents per BulkWrite operation. MongoDB has a 16 MB limit on BSON document size, and exceeding it can cause the operation to fail. Of course, for simple documents with only a few fields the batch size can be 10,000 or even more.
The BypassDocumentValidation and IsOrdered options can also significantly speed up the write process.
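For example, the chunking could look roughly like this (a sketch only: the BulkSave helper and its batchSize parameter are my own names, and it assumes the A class and usings from your question):

// Splits the documents into batches and issues one unordered BulkWrite per batch.
static void BulkSave(IMongoCollection<A> collection, IEnumerable<A> docs, int batchSize = 1000)
{
    var options = new BulkWriteOptions { BypassDocumentValidation = true, IsOrdered = false };

    foreach (var batch in docs
        .Select((doc, i) => new { doc, i })
        .GroupBy(x => x.i / batchSize, x => x.doc))
    {
        // one ReplaceOne upsert per document, keyed on customId
        var operations = batch
            .Select(doc => (WriteModel<A>)new ReplaceOneModel<A>(
                new BsonDocument("customId", doc.customId), doc) { IsUpsert = true })
            .ToList();

        collection.BulkWrite(operations, options);
    }
}

You could then replace the docs.ToList().ForEach(x => SaveDoc(x)) loop in Main with a single BulkSave(Coll, docs) call and compare the timings.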
And one more thing...
You can use raw BsonDocuments instead of filter builders to eliminate the LINQ expression processing and filter-parsing penalty.
//instead of this
Builders<A>.Filter.Where(x => x.customId == doc.customId)
//you can use BsonDocument
new BsonDocument("customId", doc.customId)
Your filter builder will be serialized to exactly the same BSON doc before command execution.
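If you want to check that yourself, with a 2.x driver you can render the builder-generated filter and compare it to the raw BsonDocument (this uses the Render overload that takes a serializer and registry, which changed in later driver versions, and needs using MongoDB.Bson.Serialization;):

// Render the LINQ-based filter to see the BSON it produces (2.x driver API).
var registry = BsonSerializer.SerializerRegistry;
var serializer = registry.GetSerializer<A>();
var rendered = Builders<A>.Filter
    .Where(x => x.customId == doc.customId)
    .Render(serializer, registry);

Console.WriteLine(rendered); // same document as new BsonDocument("customId", doc.customId)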