
How to reach the same performance with the C# MongoDB driver as with PyMongo in Python?

Tags: python, c#, mongodb

In a current benchmark of MongoDB drivers, we noticed a huge difference in performance between Python and .NET (Core or Framework).

In my opinion, part of the difference can be explained by this.

We obtained the following results:


┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ Metric                  ┃ Csharp   ┃ Python  ┃ ratio p/c ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━╋━━━━━━━━━╋━━━━━━━━━━━┫
┃ Ratio Duration/Document ┃ 24.82    ┃ 0.03    ┃ 0.001     ┃
┃ Duration (ms)           ┃ 49 638   ┃ 20 016  ┃ 0.40      ┃
┃ Count                   ┃ 2000     ┃ 671 972 ┃ 336       ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━┻━━━━━━━━━┻━━━━━━━━━━━┛

We looked at memory allocation in C# and noticed a ping-pong between the download phases of a BsonChunk and deserialization (normal, since results arrive in batches). But the download phases were very long, so we looked at the network trace of the different queries, since MongoDB uses TCP/IP:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Metric                        ┃ Csharp    ┃ Python     ┃ ratio p/c ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━╋━━━━━━━━━━━━╋━━━━━━━━━━━┫
┃ Packets/sec to db             ┃ 30        ┃ 160        ┃ 5.3       ┃
┃ Packets/sec from DB           ┃ 120 - 150 ┃ 750 - 1050 ┃ ~6.5      ┃
┃ Packet count to db            ┃ 1560      ┃ 2870       ┃ 1.84      ┃
┃ Packet count from db          ┃ 7935      ┃ 13663      ┃ 1.7       ┃
┃ Packet average length to db   ┃ 73.6      ┃ 57.6       ┃ 0.74      ┃
┃ Packet average length from db ┃ 1494      ┃ 1513       ┃ 1.01      ┃
┃ Max TCP Errors/sec            ┃ 20        ┃ 170        ┃ 8.5       ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━┻━━━━━━━━━━━━┻━━━━━━━━━━━┛
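
As a rough cross-check, the packet counts and average packet lengths above can be turned into an approximate number of bytes received per returned document. This is only a back-of-the-envelope sketch: it assumes the capture covers the same runs as the first results table, and it ignores protocol headers and retransmissions, so the figures are upper bounds.

```python
# Rough estimate of bytes received per returned document,
# using the figures from the two tables above.
# Assumption: packet lengths include protocol overhead, so these are upper bounds.

runs = {
    "csharp": {"packets_from_db": 7935, "avg_len": 1494, "documents": 2000},
    "python": {"packets_from_db": 13663, "avg_len": 1513, "documents": 671972},
}


def bytes_per_document(run):
    """Total bytes received from the database divided by documents returned."""
    total_bytes = run["packets_from_db"] * run["avg_len"]
    return total_bytes / run["documents"]


for name, run in runs.items():
    print(f"{name}: ~{bytes_per_document(run):.0f} bytes per document")
```

Under these assumptions the two clients receive very different amounts of data per document, which hints that they may not be fetching the same thing.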

As for the config file, the difference is this striking only with the following one:

{
    "connectionString": "mongodb://ip.of.the.mongo:27018",
    "dbname" : "mydb",
    "colname" : "mycollection",
    "query" : {},
    "projection" :{},
    "limit" : 2000,
    "batchsize": 10
}

The per-document latency is striking: 0.03 ms for Python versus 24.82 ms for C#.
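
For reference, these per-document figures can be recomputed from the duration and count columns of the first results table. The snippet below is just that arithmetic, with the table values copied in by hand.

```python
# Per-document latency recomputed from the first results table.
results = {
    "csharp": {"duration_ms": 49638, "count": 2000},
    "python": {"duration_ms": 20016, "count": 671972},
}


def ms_per_document(result):
    """Duration in milliseconds divided by the number of returned documents."""
    return result["duration_ms"] / result["count"]


latency = {lang: ms_per_document(r) for lang, r in results.items()}
for lang, value in latency.items():
    print(f"{lang}: {value:.2f} ms per document")
```

Note that the two runs did not return the same number of documents (2000 versus 671 972), so the per-document figures compare different workloads.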

Do you have any insight into this difference? Do you know a way to reach the same performance in C# as in Python? Thank you in advance :-)

The benchmark uses the following two programs:

Python (pymongo driver):

#!/usr/bin/env python3

import pymongo
import time
import json
import os

queries_dir = "../queries"
results_dir = "../results"

for subdir, dirs, files in os.walk(queries_dir):
    for f in files:
        filepath = subdir + os.sep + f
        print(filepath)
        conf = json.load(open(filepath))
        conf["language"] = "python"
        client = pymongo.MongoClient(conf["connectionString"])
        db = client[conf["dbname"]]
        col = db[conf["colname"]]

        initConnection = col.find({}, {}).limit(1)
        for element in initConnection:
            print(element)

        input("Press enter to continue.")

        res = col.find(conf["query"], conf["projection"])
        returned = 0

        start = time.time()
        for i in res:
            returned += 1
        duration = (time.time() - start) * 1000
        conf["duration"] = duration
        conf["returned"] = returned
        conf["duration_per_returned"] = float(duration) / float(returned) 
        d = time.strftime("%Y-%m-%d_%H-%M-%S")
        fr = open(results_dir + os.sep + d + "_" + conf["language"] + "_" + f,"w")
        json.dump(conf, fr, indent=4, sort_keys=True)
        fr.close()
        print(json.dumps(conf,indent=4, sort_keys=True))

And for .Net (MongoDB.Driver):

class Program
    {
        static void Main(string[] args)
        {
            var dir = Directory.GetCurrentDirectory();
            var queryDirectory = dir.Replace(@"csharp\benchmark\benchmark\bin\Debug\netcoreapp2.2", string.Empty) + "queries"; 
            var resultDirectory = dir.Replace(@"csharp\benchmark\benchmark\bin\Debug\netcoreapp2.2", string.Empty)+ "results";
            var configurationFiles = Directory.GetFiles(queryDirectory);
            foreach (var file in configurationFiles)
            {
                var configuration = JsonConvert.DeserializeObject<BenchmarkConfiguration>(File.ReadAllText(file));
                var collection = new MongoClient(configuration.ConnectionString)
                    .GetDatabase(configuration.Database)
                    .GetCollection<BsonDocument>(configuration.Collection);
                var filters = BsonDocument.Parse((string)(configuration.Query.ToString()));
                var projection = BsonDocument.Parse((string)(configuration.Projection.ToString()));
                var query = collection.Find(filters, new FindOptions { BatchSize = configuration.BatchSize }).Project(projection).Limit(configuration.Limit);

                var initconnection = collection.Find(new BsonDocument { }).Limit(1).FirstOrDefault();
                Console.WriteLine(initconnection.ToString());
                Console.WriteLine("Press Enter to continue.");
                Console.ReadLine();

                var watch = new Stopwatch();
                watch.Start();
                var results = query.ToList();
                watch.Stop();
                var time = watch.ElapsedMilliseconds;
                var now = DateTime.Now.ToString("yyyy-MM-dd_hh-mm-ss");
                var report = new BenchmarkResult(configuration, time, results.Count());
                File.WriteAllText($"{resultDirectory}/{now}_csharp_{Path.GetFileName(file)}", JsonConvert.SerializeObject(report, Formatting.Indented));
            }
        }
    }

OLD METRICS: before removing the connection pool from the benchmark loop.


┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ Metric                  ┃ Csharp   ┃ Python  ┃ ratio p/c ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━╋━━━━━━━━━╋━━━━━━━━━━━┫
┃ Ratio Duration/Document ┃ 26.07    ┃ 0.06    ┃ 0.002     ┃
┃ Duration (ms)           ┃ 52 145.0 ┃ 41 981  ┃ 0.80      ┃
┃ Count                   ┃ 2000     ┃ 671 972 ┃ 336       ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━┻━━━━━━━━━┻━━━━━━━━━━━┛

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Metric                        ┃ Csharp    ┃ Python     ┃ ratio p/c ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━╋━━━━━━━━━━━━╋━━━━━━━━━━━┫
┃ Packets/sec to db             ┃ 30        ┃ 150        ┃ 5         ┃
┃ Packets/sec from DB           ┃ 120 - 180 ┃ 750 - 1050 ┃ ~6        ┃
┃ Packet count to db            ┃ 1540      ┃ 2815       ┃ 1.8       ┃
┃ Packet count from db          ┃ 7946      ┃ 13700      ┃ 1.7       ┃
┃ Packet average length to db   ┃ 74        ┃ 59         ┃ 0.80      ┃
┃ Packet average length from db ┃ 1493      ┃ 1512       ┃ 1         ┃
┃ Max TCP Errors/sec            ┃ 10        ┃ 320        ┃ 32        ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━┻━━━━━━━━━━━━┻━━━━━━━━━━━┛

Asked Aug 23 '19 07:08 by Balbereith


2 Answers

Projection issue with pymongo

The performance difference you are seeing between the C# and Python drivers comes down to a small detail: how PyMongo interprets the empty { } projection.

In the mongo shell and in the C# driver, a { } projection returns the whole document; in effect there is no projection at all. This is probably best regarded as the default behavior, since the mongo shell works that way.

However, with PyMongo a { } projection returns only the _id field!
The heavier the documents, the longer your C# code takes to download them, while your Python code fetches only the tiny _id values.
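
The difference can be pictured with a small pure-Python model. The function names are hypothetical, and this is a simplified illustration of the behaviour described in this answer, not the drivers' actual code; it handles only simple inclusion projections.

```python
# Simplified model of how an empty projection is interpreted.
# Hypothetical helpers for illustration only; this is not driver code.

def shell_like_projection(projection):
    """Shell / C# driver behaviour as described above: {} means 'no projection'."""
    return projection if projection else None


def pymongo_like_projection(projection):
    """PyMongo behaviour as described above: {} means 'only _id'."""
    if projection is not None and not projection:
        return {"_id": 1}
    return projection


def apply_projection(document, projection):
    """Return the fields a client would receive (inclusion projections only)."""
    if projection is None:
        return dict(document)
    keep_id = projection.get("_id", 1)  # _id is kept unless explicitly excluded
    return {
        key: value
        for key, value in document.items()
        if (key == "_id" and keep_id) or (key != "_id" and projection.get(key))
    }


doc = {"_id": 1, "name": "example", "payload": "x" * 1000}
print(sorted(apply_projection(doc, shell_like_projection({}))))    # ['_id', 'name', 'payload']
print(sorted(apply_projection(doc, pymongo_like_projection({}))))  # ['_id']
```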

Fixing your bench code

If you change from

res = col.find(conf["query"], conf["projection"]) #projection being { }.
for i in res:
    print(i) # Only print Id fields.
    returned += 1

to

res = col.find(conf["query"])
for i in res:
    print(i) # Whole document 'downloaded' and printed.
    returned += 1

you will notice that both perform well.
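
One way to make this less error-prone is to build the find() arguments from the config in one place and drop an empty projection there. The sketch below is an assumption about how the benchmark script could be adapted: `build_find_kwargs` is a hypothetical helper, and forwarding `limit` and `batchsize` is an addition that the original Python script does not have (the C# program passes both). It relies only on the `filter`, `projection`, `limit` and `batch_size` keyword arguments of PyMongo's `Collection.find()`.

```python
def build_find_kwargs(conf):
    """Translate a benchmark config dict into keyword arguments for Collection.find()."""
    kwargs = {"filter": conf.get("query", {})}
    projection = conf.get("projection")
    if projection:  # skip None and {} so the whole document is returned
        kwargs["projection"] = projection
    if conf.get("limit"):
        kwargs["limit"] = conf["limit"]
    if conf.get("batchsize"):
        kwargs["batch_size"] = conf["batchsize"]
    return kwargs


# Same values as the config file shown in the question.
conf = {"query": {}, "projection": {}, "limit": 2000, "batchsize": 10}
print(build_find_kwargs(conf))

# Intended use in the benchmark (requires a live server):
#     res = col.find(**build_find_kwargs(conf))
```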

Answered Sep 28 '22 20:09 by Joris La Cancellera


After a (quick) look at the mongo-csharp-driver source, I would say that:

  • Connection pooling is mostly cluster-related

On this point, we have:

  • ExclusiveConnectionPool, based on ConnectionPoolSettings, whose default values are all Optional<>, i.e. default plus a hardcoded default value.
  • These are called through ExclusiveConnectionPoolFactory > Server > ClusterBuilder > ServerFactory > Cluster (or any IClusterableServerFactory?), ending at SingleServerCluster or MultiServerCluster.

From a quick look through mongo-python-driver, pooling is not applied in the same way or context: it is applied to direct server connections, without any need for cluster configuration.

According to the docs, just adding minPoolSize and maxPoolSize should do the trick.

For example: $"{yourconnectionstring}/?appname=pooledClient&minPoolSize=5&maxPoolSize=100"
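
The same connection string can be assembled programmatically. Below is a minimal sketch in Python (the C# interpolation above does the same thing); `with_pool_options` is a hypothetical helper, and it assumes the base connection string carries no database path or options of its own.

```python
from urllib.parse import urlencode


def with_pool_options(connection_string, min_pool_size=5, max_pool_size=100,
                      appname="pooledClient"):
    """Append pool-related options to a bare mongodb:// connection string.

    Assumption: the input has no existing path or query string.
    """
    options = urlencode({
        "appname": appname,
        "minPoolSize": min_pool_size,
        "maxPoolSize": max_pool_size,
    })
    return f"{connection_string.rstrip('/')}/?{options}"


# Host and port taken from the config file in the question.
uri = with_pool_options("mongodb://ip.of.the.mongo:27018")
print(uri)
```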

mongod logs

The question to ask (to the team/maintainers) is: Why is pooling not enabled by default, for non-cluster configuration?


About the performance for single vs multi vs pooled MongoClient in 2.9.1

  • ReadMultiShared: 4 clients, in a Parallel.ForEach, with skip+limit
  • ReadMultiSharedAsync: 2 clients, async call, with skip+limit
  • ReadAllSingle: 1 client
  • ReadAllPooled: 1 client with pool

Mongod running in a local Docker.

// * Summary *

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
Intel Core i7-4790 CPU 3.60GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview8-013656
  [Host]     : .NET Core 2.2.2 (CoreCLR 4.6.27317.07, CoreFX 4.6.27318.02), 64bit RyuJIT
  Job-VBWGDB : .NET Core 2.2.2 (CoreCLR 4.6.27317.07, CoreFX 4.6.27318.02), 64bit RyuJIT

Toolchain=.NET Core 2.2

|               Method |         Mean |        Error |       StdDev | Rank |     Gen 0 |     Gen 1 | Gen 2 |   Allocated |
|--------------------- |-------------:|-------------:|-------------:|-----:|----------:|----------:|------:|------------:|
|          BsonReadOne |     806.5 us |     15.94 us |     23.37 us |    1 |    4.8828 |         - |     - |    21.15 KB |
|    BasicClassReadOne |     827.9 us |     16.34 us |     33.75 us |    1 |    3.9063 |         - |     - |    19.99 KB |
|      ReadMultiShared | 531,369.5 us | 10,611.11 us | 26,029.23 us |    3 | 6000.0000 | 2000.0000 |     - | 30080.81 KB |
| ReadMultiSharedAsync | 644,437.8 us | 12,396.64 us | 11,595.83 us |    4 | 7000.0000 | 2000.0000 |     - |    93.48 KB |
|        ReadAllSingle | 512,507.9 us | 10,172.46 us | 11,306.67 us |    2 | 6000.0000 | 2000.0000 |     - | 39375.26 KB |
|        ReadAllPooled | 513,240.1 us | 10,423.18 us | 18,255.36 us |    2 | 6000.0000 | 2000.0000 |     - | 39375.26 KB |

src

Edit: BatchSize also seems to have a real impact on query performance. With batching, ReadMultiShared performs better than a single client using it; without batching, it underperforms.
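
To give a feel for why the batch size matters, here is a small sketch that estimates how many batches a cursor has to fetch for the values in the question's config. It assumes every batch is full and that each batch costs one round trip to the server, which is a simplification.

```python
import math


def batches_needed(limit, batch_size):
    """Approximate number of batches a cursor returns for `limit` documents."""
    return math.ceil(limit / batch_size)


limit, batch_size = 2000, 10   # values from the question's config file
csharp_duration_ms = 49638     # C# duration from the question's results table

n = batches_needed(limit, batch_size)
print(f"{n} batches")
print(f"~{csharp_duration_ms / n:.1f} ms per batch")
```

With these numbers, a small batch size means a couple of hundred round trips, so per-round-trip latency can dominate the total duration.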

Answered Sep 28 '22 18:09 by Herve-M