We are looking at a document db storage solution with failover clustering for a read/write-intensive application.
We will average around 40K concurrent writes per second to the db (peaks can go up to 70,000), and may have roughly the same number of reads happening.
We also need a mechanism for the db to notify us about newly written records (some kind of trigger at the db level).
What would be a good choice of document db, and how should we approach the related capacity planning?
Update: more details on the expectations.
CouchDB accepts queries via a RESTful HTTP API, while MongoDB uses its own query language. CouchDB prioritizes availability, while MongoDB prioritizes consistency. MongoDB has a much larger user base than CouchDB, making it easier to find support and hire employees for this database solution.
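To make that contrast concrete, here is a minimal Python sketch of querying each system (the server addresses, the mydb database, and the field names are all assumptions):

    import requests
    from pymongo import MongoClient

    # CouchDB: everything, including queries, goes over plain HTTP
    resp = requests.get(
        "http://localhost:5984/mydb/_all_docs",
        params={"include_docs": "true"},
    )
    rows = resp.json()["rows"]

    # MongoDB: the driver speaks MongoDB's own query language
    client = MongoClient("mongodb://localhost:27017")
    docs = list(client.mydb.events.find({"type": "order"}))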
Sharded clusters in MongoDB are another way to improve performance. Like replication, sharding is a way to distribute large data sets across multiple servers; but where replication copies the data, sharding uses what's called a shard key to partition the data into pieces ("shards") and spread them across multiple servers.
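A rough sketch of setting that up, assuming a running sharded cluster and a hypothetical mydb.events collection; enableSharding and shardCollection are MongoDB's standard admin commands:

    from pymongo import MongoClient

    # Connect to a mongos query router (the address is an assumption)
    client = MongoClient("mongodb://mongos-host:27017")

    # Enable sharding for the database, then shard the collection.
    # A hashed shard key spreads a write-heavy load evenly across shards.
    client.admin.command("enableSharding", "mydb")
    client.admin.command("shardCollection", "mydb.events",
                         key={"user_id": "hashed"})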
MongoDB performed better than Couchbase on a 4-node cluster, but worse on 10- and 20-node clusters. Cassandra did not perform as well, reaching 2,570 ops/sec on a 4-node cluster, 4,230 ops/sec on a 10-node cluster, and 6,563 ops/sec on a 20-node cluster.
MongoDB and CouchDB are both document-based NoSQL databases. A document database is also called a document store; document stores are typically used to hold semi-structured data as self-describing documents, together with a detailed description of that data.
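For example, a single order could be stored as one self-describing document (all field names here are purely illustrative):

    # Fields can be nested and can vary from one document to the next
    order = {
        "_id": "order-1001",
        "type": "order",
        "user_id": 42,
        "items": [
            {"sku": "A-17", "qty": 2},
            {"sku": "B-03", "qty": 1},
        ],
        "created_at": "2011-09-01T12:00:00Z",
    }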
if "20,000 concurrent writes" means inserts then I would go for CouchDB and use "_changes" api for triggers. But with 20.000 writes you would need a stable sharding aswell. Then you would better take a look at bigcouch
And if "20.000" concurrent writes consist "mostly" updates I would go for MongoDB for sure, since Its "update in place" is pretty awesome. But then you should handle triggers manually, but using another collection to update in place a general document can be a handy solution. Again be careful about sharding.
Finally, I think you cannot select a database on concurrency numbers alone; you need to plan the API (how you will retrieve data) and then look at the options at hand.
I would recommend MongoDB. My requirements weren't nearly as high as yours, but they were reasonably close. Assuming you'll be using C#, I recommend the official MongoDB C# driver and the InsertBatch method with SafeMode turned on. It will write data as fast as your file system can handle.
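The same batched, acknowledged-insert pattern as a rough Python sketch (insert_many with write concern w=1 plays the role of InsertBatch with SafeMode in the C# driver; the collection and field names are made up):

    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://localhost:27017")

    # w=1 makes the server acknowledge every write, like SafeMode
    events = client.mydb.get_collection(
        "events", write_concern=WriteConcern(w=1)
    )

    # Batching amortizes per-request overhead; unordered lets the
    # server process the batch in parallel
    batch = [{"n": i, "payload": "x" * 64} for i in range(10_000)]
    events.insert_many(batch, ordered=False)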
That said, I'd also recommend looking into RavenDB. It supports everything you're looking for, but for the life of me I couldn't get it to perform anywhere close to Mongo.
The only other database that came close to MongoDB was Riak. Its default Bitcask backend is ridiculously fast as long as you have enough memory to store the keyspace, but as I recall it doesn't support triggers.
Membase (and the soon-to-be-released Couchbase Server) will easily handle your needs and provides dynamic scalability (adding or removing nodes on the fly) and replication with failover. The memcached caching layer on top will easily handle 200K ops/sec, and you can scale out linearly with multiple nodes to keep up with persisting the data to disk.
We've got some recent benchmarks showing extremely low latency (which roughly equates to high throughput): http://10gigabitethernet.typepad.com/network_stack/2011/09/couchbase-goes-faster-with-openonload.html
I don't know how important it is for you to have a supported, enterprise-class product with engineering and QA resources behind it, but that's available too.
Edit: I forgot to mention that there is already a built-in trigger interface, and we're extending it even further to track when data hits disk (is persisted) or is replicated.
Perry