I did a small benchmark test to compare Couchbase (running in Win) with Redis and MySql (EDIT: added Aerospike to test)
We are inserting 100 000 JSON "documents" into three db/stores:
The JSON file is 67 lines, about 1800 bytes.
INSERT:
READ: We are reading 1000 times, and we do this 10 times and look at averages.
Conclusion: Couchbase seems slowest (the INSERT times varies a lot it seems), Aerospike is also very slow. Both of these are using in-memory storage (Couchbase => Ephemeral bucket, Aerospike => storage-engine memory).
Question: Why the in-memory write and read on Couchbase so slow, even slower than using normal MySQL (on an SSD)?
Note: Using Task.WhenAll, or awaiting each call, doesn't make a difference.
INSERT
Couchbase:
IBucket bucket = await cluster.BucketAsync("halo"); // <-- ephemeral 
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");
// EDIT: Added this to avoid measuring lazy loading:
JObject t = JObject.FromObject(_baseJsonObject);
t["JobId"] = 0;
t["CustomerName"] = $"{firstnames[rand.Next(0, firstnames.Count - 1)]} {lastnames[rand.Next(0, lastnames.Count - 1)]}";
await collection.InsertAsync("0", t);
await collection.RemoveAsync("0");
List<Task> inserTasks = new List<Task>();
sw.Start();
foreach (JObject temp in jsonObjects) // jsonObjects is pre-created so its not a factor in the test
{
    inserTasks.Add(collection.InsertAsync(temp.GetValue("JobId").ToString(), temp));
}
await Task.WhenAll(inserTasks);
sw.Stop();
Console.WriteLine($"Adding {nbr} to Couchbase took {sw.ElapsedMilliseconds} ms");
Redis (using ServiceStack!)
sw.Restart();
using (var client = redisManager.GetClient())
{
    foreach (JObject temp in jsonObjects)
    {
        client.Set($"jobId:{temp.GetValue("JobId")}", temp.ToString());
    }
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to Redis took {sw.ElapsedMilliseconds} ms");
sw.Reset();
Mysql:
MySql.Data.MySqlClient.MySqlConnection mySqlConnection = new MySql.Data.MySqlClient.MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
sw.Restart();
foreach (JObject temp in jsonObjects)
{
    MySql.Data.MySqlClient.MySqlCommand cmd = new MySql.Data.MySqlClient.MySqlCommand($"INSERT INTO test (id, data) VALUES ('{temp.GetValue("JobId")}', @data)", mySqlConnection);
    cmd.Parameters.AddWithValue("@data", temp.ToString());
    cmd.ExecuteNonQuery();
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to MySql took {sw.ElapsedMilliseconds} ms");
sw.Reset();
READ
Couchbase:
IBucket bucket = await cluster.BucketAsync("halo");
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 0; i < 1000; i++)
    {
        string key = $"{r.Next(1, 100000)}";
        var result = await collection.GetAsync(key);
    }
    sw.Stop();
    Console.WriteLine($"Couchbase Q: {q}\t{sw.ElapsedMilliseconds}");
Redis:
    Stopwatch sw = Stopwatch.StartNew();
    using (var client = redisManager.GetClient())
    {
        for (int i = 0; i < nbr; i++)
        {
            client.Get<string>($"jobId:{r.Next(1, 100000)}");
        }
    }
    sw.Stop();
    Console.WriteLine($"Redis Q: {q}\t{sw.ElapsedMilliseconds}");
MySQL:
MySqlConnection mySqlConnection = new MySql.Data.MySqlClient.MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
            
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < nbr; i++)
{
    MySqlCommand cmd = new MySql.Data.MySqlClient.MySqlCommand($"SELECT data FROM test WHERE Id='{r.Next(1, 100000)}'", mySqlConnection);
    using MySqlDataReader rdr = cmd.ExecuteReader();
    while (rdr.Read())
    {
    }
}
sw.Stop();
Console.WriteLine($"MySql Q: {q} \t{sw.ElapsedMilliseconds} ms");
sw.Reset();

and

and Bucket Durability:

I only have 1 Node (no cluster), it's local on my machine, running Ryzen 3900x 12 cores, M.2 SSD, Win10, 32 GB RAM.
If you made it this far, here is a GitHub repo with my benchmark code: https://github.com/tedekeroth/CouchbaseTests
I took your CouchbaseTests, commented out the non-Couchbase bits. Fixed the query to select from the collection ( myCollection ) instead of jobcache, and removed the Metrics option. And created an index on JobId. create index mybucket_JobId on default:myBucket.myScope.myCollection (JobId) It inserts the 100,000 documents in 19 seconds and kv-fetches the documents on average 146 usec and query by JobId on average 965 usec.
Couchbase Q: 0 187
Couchbase Q: 1 176
Couchbase Q: 2 143
Couchbase Q: 3 147
Couchbase Q: 4 140
Couchbase Q: 5 138
Couchbase Q: 6 136
Couchbase Q: 7 139
Couchbase Q: 8 125
Couchbase Q: 9 129
average et: 146 ms per 1000 -> 146 usec / request
Couchbase Q: 0 1155
Couchbase Q: 1 1086
Couchbase Q: 2 1004
Couchbase Q: 3 901
Couchbase Q: 4 920
Couchbase Q: 5 929
Couchbase Q: 6 912
Couchbase Q: 7 911
Couchbase Q: 8 911
Couchbase Q: 9 927
average et: 965 ms per 1000 -> 965 usec / request. (coincidentally exactly the same as with the java api).
This was on 7.0 build 3739 on a Mac Book Pro with the cbserver running locally.
######################################################################
I have a small LoadDriver application for the java sdk that uses the kv api. With 4 threads, it shows an average response time of 54 micro-seconds and throughput of 73238 requests/second. It uses the travel-sample bucket on a cb server on localhost. [email protected]:mikereiche/loaddriver.git
Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms count: 729873, requests/second: 72987, max: 2796us avg: 54us, aggregate rq/s: 73238
For the query API I get the following which is 18 times slower.
Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms count: 41378, requests/second: 4137, max: 12032us avg: 965us, aggregate rq/s: 4144
I would have to run such a comparison myself to do a full investigation, but two things stand out.
Your parallel execution isn't truly fully parallel.  async methods run synchronously up to the first await, so all of the code in InsertAsync/GetAsync before the first await is running sequentially as you add your tasks, not parallel.
CouchbaseNetClient does some lazy connection setup in the background, and you're paying that cost in the timed section. Depending on the environment, including SSL negotiation and such things, this can be a significant initial latency.
You can potentially address the first issue by using Task.Run to kick off the operation, but you may need to pre-size the default Threadpool size.
You can address the second issue by doing at least one operation on the bucket (including bucket.WaitUntilReadyAsync()) before the timed section.
60 seconds for inserts still look abnormal. How many nodes and what Durability setting are you using?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With