I read a couple of articles online comparing MongoDB and Cassandra read/write performance.
Write
It is generally said that Cassandra's write performance is better than Mongo's when data is humongous. See the statement below.
Cassandra's storage engine provides constant-time writes no matter how big your data set grows. Writes are more problematic in MongoDB, partly because of the b-tree based storage engine, but more because of the per database write lock.
Here is my question: is this statement still correct? As I understand it, MongoDB now supports locking per document instead of per database, right? So is Cassandra still better than MongoDB in write performance at present? If yes, why?
Read
It is generally said that MongoDB's read performance is better than Cassandra's, but I did not find any explanation of what makes MongoDB's reads better than Cassandra's.
Update:
From Jared's answer on this forum:
Reads are more efficient in MongoDB's storage engine than they are in Cassandra. Cassandra's storage engine performs very well on writes because it stores data in an append only format. This makes great use of spinning disk drives that have poor seek times, but can do serial writes very quickly. But the downside is that when you do a read, you often need to scan through several versions of an object to get the most recent version to return to the caller. MongoDB updates data in place. This means it does more random IO when writes are processed, but it has the benefit of being faster when processing reads, since you can find the exact location of the object on disk in one b-tree lookup.
This helped me understand that Cassandra is faster when editing or deleting an existing record, because it just appends the new version at the end instead of editing in place like MongoDB, which has to find the record first and then modify it. This makes Cassandra better at writes than MongoDB.
But the same design makes Cassandra slower than MongoDB on reads, because Cassandra may have to scan through several versions of the same record to find the most recent one to return to the caller.
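To make the trade-off concrete, here is a minimal toy sketch in Python. It is not how either database is actually implemented: `AppendOnlyStore` and `InPlaceStore` are hypothetical names, a plain list stands in for the append-only files, and a dict stands in for the b-tree. It only illustrates that appending makes writes cheap while reads may have to step over other entries and older versions.

```python
class AppendOnlyStore:
    """Toy append-only store: writes never look anything up."""

    def __init__(self):
        self.log = []  # (key, value) pairs in write order

    def write(self, key, value):
        # An update is just another append; older versions stay in the log.
        self.log.append((key, value))

    def read(self, key):
        # Scan from newest to oldest and count the entries examined.
        examined = 0
        for k, v in reversed(self.log):
            examined += 1
            if k == key:
                return v, examined
        return None, examined


class InPlaceStore:
    """Toy in-place store: a dict stands in for the b-tree index."""

    def __init__(self):
        self.index = {}

    def write(self, key, value):
        # Locate the record and overwrite it; only one version is kept.
        self.index[key] = value

    def read(self, key):
        # A single lookup finds the current version.
        return self.index.get(key), 1


writes = [("u1", "v1"), ("u2", "x"), ("u1", "v2"), ("u3", "y")]
log_store, inplace_store = AppendOnlyStore(), InPlaceStore()
for key, value in writes:
    log_store.write(key, value)
    inplace_store.write(key, value)

print(log_store.read("u1"))      # ('v2', 2)
print(inplace_store.read("u1"))  # ('v2', 1)
```

Both stores return the latest value, but the append-only one keeps all four entries around and has to examine more than one of them to answer the read.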
Another reason, from this blog, why Cassandra is better at writes:
MongoDB with its “single master” model can take writes only on the primary. The secondary servers can only be used for reads. So essentially if you have three node replica set, only the master is taking writes and the other two nodes are only used for reads. This greatly limits write scalability. You can deploy multiple shards but essentially only 1/3 of your data nodes can take writes. Cassandra with its “multiple master” model can take writes on any server. Essentially your write scalability is limited by the number of servers you have in the cluster. The more servers you have in the cluster, the better it will scale.
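The arithmetic behind the "only 1/3 of your data nodes" remark can be sketched as follows. The function names are hypothetical, and the model only counts nodes that accept client writes; it deliberately ignores the replication work every node still has to do in both systems.

```python
def mongo_write_capable_nodes(shards, members_per_replica_set=3):
    """Return (write-capable nodes, total nodes) for a sharded deployment.

    Assumes one primary per shard's replica set, as the quote describes.
    """
    total = shards * members_per_replica_set
    return shards, total


def cassandra_write_capable_nodes(nodes):
    """Return (write-capable nodes, total nodes) when any node takes writes."""
    return nodes, nodes


# 4 shards with 3 members each: 12 nodes, of which 4 accept writes.
print(mongo_write_capable_nodes(4))       # (4, 12)
# A 12-node Cassandra cluster: all 12 accept writes.
print(cassandra_write_capable_nodes(12))  # (12, 12)
```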
From the same blog, why MongoDB is better at reads than Cassandra:
Secondary indexes are a first-class construct in MongoDB. This makes it easy to index any property of an object stored in MongoDB even if it is nested. This makes it really easy to query based on these secondary indexes. Cassandra has only cursory support for secondary indexes. Secondary indexes are also limited to single columns and equality comparisons. If you are mostly going to be querying by the primary key then Cassandra will work well for you.
Answers to your questions: Yes. Recent versions of MongoDB support document-level concurrency control when using the WiredTiger storage engine. https://docs.mongodb.com/manual/core/wiredtiger/
Here are benchmarks of write operations: https://www.datastax.com/nosql-databases/benchmarks-cassandra-vs-mongodb-vs-hbase According to these benchmarks, Cassandra performs better at scale (with a larger number of nodes in the cluster).
Hope it will help you.
Here are some details regarding your question which also might help.
Regarding Cassandra
Cassandra uses an LSM-tree storage engine, which is optimized for heavy writes. https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_manage_ondisk_c.html
Some details:
When performing a write, the data is immediately written to a commit log. The commit log is a crash-recovery mechanism. A write is not considered successful until it is written to the commit log. After the data is written to the commit log, it is written to the memtable. In recent versions of Cassandra, memtables can be stored mostly in native (off-heap) memory rather than on the JVM heap, which reduces garbage-collection pressure and so also helps performance.
When the number of objects stored in the memtable reaches a threshold, the contents of the memtable are flushed to disk in a file called an SSTable. A new memtable is then created. Once a memtable is flushed to an SSTable, it is immutable.
No reads or seeks of any kind are required for writing a value to Cassandra because all writes are append operations.
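The write path described above (commit log, then memtable, then flush to an immutable SSTable) can be sketched roughly like this. It is a toy model under stated assumptions, not Cassandra's code: `ToyLSMStore` and `memtable_limit` are hypothetical names, everything lives in memory, and commit-log recycling and compaction are left out.

```python
class ToyLSMStore:
    """Toy model of an LSM-style write path."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.commit_log = []  # append-only record used for crash recovery
        self.memtable = {}    # in-memory table holding the newest values
        self.sstables = []    # flushed, immutable tables, oldest first

    def write(self, key, value):
        # 1. Append to the commit log first; no read or seek is needed.
        self.commit_log.append((key, value))
        # 2. Then apply the write to the memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable reaches the threshold.
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A sorted tuple stands in for an immutable on-disk SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}  # start a new, empty memtable

    def read(self, key):
        # Check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


store = ToyLSMStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)  # memtable reaches the limit and is flushed
store.write("a", 3)  # newer version of "a" lands in the new memtable
print(store.read("a"), store.read("b"))  # 3 2
```

Note that the old value of "a" is still sitting in the flushed table; the read simply finds the newer one first.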
Regarding MongoDB
Older versions of MongoDB used the MMAPv1 storage engine by default, which is based on B-trees (https://docs.mongodb.com/manual/core/mmapv1/). Since MongoDB 3.2 the default is the WiredTiger storage engine (https://docs.mongodb.com/manual/core/wiredtiger/), which uses B-trees by default; WiredTiger itself can also support LSM trees.
With respect to locks: with WiredTiger, MongoDB supports document-level concurrency control, whereas MMAPv1 supports collection-level concurrency control.
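As a rough illustration of why the granularity matters, the sketch below counts how many pending writes could proceed at the same time under each scheme. `max_parallel_writes` is a hypothetical helper; it only counts conflicts and does not model how either storage engine actually schedules writes.

```python
def max_parallel_writes(pending_doc_ids, granularity):
    """Count how many pending writes to one collection could run concurrently.

    pending_doc_ids: ids of the documents the pending writes target.
    granularity: "collection" or "document".
    """
    if not pending_doc_ids:
        return 0
    if granularity == "collection":
        # Every write contends for the same collection, so one at a time.
        return 1
    if granularity == "document":
        # Only writes to the same document conflict with each other.
        return len(set(pending_doc_ids))
    raise ValueError(f"unknown granularity: {granularity!r}")


pending = ["a", "b", "a", "c"]
print(max_parallel_writes(pending, "collection"))  # 1
print(max_parallel_writes(pending, "document"))    # 3
```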
Some useful articles:
https://dba.stackexchange.com/questions/121160/mongodb-mmapv1-vs-wiredtiger-storage-engines
https://docs.mongodb.com/manual/faq/concurrency/
https://www.percona.com/blog/2016/01/06/mongodb-revs-you-up-what-storage-engine-is-right-part-1/