
Why is MongoDB consistent but not available, and Cassandra available but not consistent?

Mongo

From this resource I understand why Mongo is not A (Highly Available), based on the statement below:

MongoDB supports a “single master” model. This means you have a master node and a number of slave nodes. In case the master goes down, one of the slaves is elected as master. This process happens automatically but it takes time, usually 10-40 seconds. During this time of new leader election, your replica set is down and cannot take writes

Is it for the same reason that Mongo is said to be Consistent (the write did not happen, so reads still return the latest data in the system) but not Available (not available for writes)?

Until the re-election completes and while the write is pending, can a slave still serve read operations? Also, does the user have to re-initiate the write once a new master is elected?

But from another angle, I do not understand why Mongo is called highly consistent. As stated in "Where does mongodb stand in the CAP theorem?":

Mongo is consistent when all reads go to the primary by default.

But that does not seem right. If all reads go to the primary under the master/slave model, what is the use of the slaves? The answer further says: "If you optionally enable reading from the secondaries then MongoDB becomes eventually consistent where it's possible to read out-of-date results." This means Mongo may not be consistent across master and slaves (unless I configure writes to reach all nodes before returning). It does not make sense to me to call Mongo consistent merely because all reads and writes go to the primary. In that case, every other DB (like Cassandra) would also be consistent, wouldn't it?

Cassandra

From this resource I understand why Cassandra is A (Highly Available), based on the statement below:

Cassandra supports a “multiple master” model. The loss of a single node does not affect the ability of the cluster to take writes – so you can achieve 100% uptime for writes

But I do not understand why Cassandra is not Consistent. Is it because a node that was unavailable for a write (because the coordinator node could not connect to it) is still available for reads, and can therefore return stale data?

user3198603 asked Jun 01 '18 14:06



1 Answer

For a better understanding of the topic, go through: MongoDB, Cassandra, and RDBMS in CAP.

First, brief definitions of consistency and availability.

Consistency means that when you write a piece of data to a distributed system, you get that same data back when you read it from any node of the system.

Availability means the system is always available for read/write operations.

Note: Most systems are not purely available or purely consistent; they offer a bit of both.

With the above definitions, let's see where MongoDB and Cassandra fall in CAP.

MongoDB

As you said, MongoDB is highly consistent when reads and writes go to the same node (the default case). You can also choose to read from secondary nodes instead of only from the leader/primary.

Now, when you read from a secondary, your consistency depends entirely on how you choose to read the data:

  • You could ask for data that is at most a certain age, say 120 seconds stale (the maxStalenessSeconds read preference option, whose minimum is 90 seconds), or
  • You could ask for only data that has been acknowledged by a majority of nodes (read concern "majority").

In the same way, when your client writes to the Mongo leader, you can specify that a write counts as successful only once the data has been replicated to a majority of servers.

From the above, MongoDB can be strongly consistent or eventually consistent, depending on how you read and write your data.
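To make those knobs concrete, here is a minimal sketch that builds two MongoDB connection strings using the standard URI options readPreference, maxStalenessSeconds, w and readConcernLevel. The host names, the replica set name and the mongo_uri helper are assumptions made up for illustration; nothing here connects to a server.

```python
from urllib.parse import urlencode

# Hypothetical hosts and replica-set name, used only for illustration.
HOSTS = "db1.example.net:27017,db2.example.net:27017,db3.example.net:27017"
REPLICA_SET = "rs0"


def mongo_uri(options):
    """Build a MongoDB connection string from a dict of URI options."""
    query = urlencode({"replicaSet": REPLICA_SET, **options})
    return f"mongodb://{HOSTS}/?{query}"


# Stronger consistency: read from the primary, require majority
# acknowledgement for writes and majority-committed data for reads.
strong_uri = mongo_uri({
    "readPreference": "primary",
    "w": "majority",
    "readConcernLevel": "majority",
})

# Eventual consistency: allow reads from secondaries that lag the primary
# by at most 120 seconds; a write is acknowledged by the primary alone.
eventual_uri = mongo_uri({
    "readPreference": "secondaryPreferred",
    "maxStalenessSeconds": 120,
    "w": 1,
})

print(strong_uri)
print(eventual_uri)
```

The same cluster serves both kinds of client; only the options each client passes decide whether it can see out-of-date results.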

Now, what about availability? MongoDB is available most of the time, but when the leader goes down it cannot accept writes until a new leader is elected. Hence, it is not highly available.

So, MongoDB is categorized under CP.
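As a rough illustration of that write outage, here is a toy single-leader model in plain Python. It is only a sketch under simplifying assumptions: ToyReplicaSet and write_with_retry are hypothetical names, replication is instantaneous, and the "election" just picks the first surviving node, which is not how MongoDB's real election works. It shows that a secondary can still answer reads while there is no primary, and that the client has to re-issue the write after the election.

```python
class ToyReplicaSet:
    """Toy single-primary replica set; not MongoDB's real election protocol."""

    def __init__(self, node_names):
        self.data = {name: {} for name in node_names}
        self.primary = node_names[0]

    def write(self, key, value):
        if self.primary is None:
            raise RuntimeError("no primary: election in progress")
        # Simplification: replication to every live node is instantaneous.
        for store in self.data.values():
            store[key] = value

    def read(self, key, node=None):
        target = node or self.primary
        if target is None:
            raise RuntimeError("no primary: election in progress")
        return self.data[target].get(key)

    def fail_primary(self):
        del self.data[self.primary]
        self.primary = None

    def elect(self):
        # Stand-in for a real election: pick the first surviving node.
        self.primary = sorted(self.data)[0]


def write_with_retry(cluster, key, value, attempts=3):
    """Client-side retry: re-issue the write until a primary exists."""
    for attempt in range(1, attempts + 1):
        try:
            cluster.write(key, value)
            return attempt
        except RuntimeError:
            cluster.elect()  # stands in for waiting out the election
    raise RuntimeError("write failed after retries")


cluster = ToyReplicaSet(["a", "b", "c"])
cluster.write("x", 1)
cluster.fail_primary()

# With no primary, a read sent explicitly to a secondary still succeeds.
secondary_read = cluster.read("x", node="b")

# The first write attempt fails; the retry succeeds after the election.
attempts_used = write_with_retry(cluster, "x", 2)
```

In a real deployment the retry is usually handled by the driver (MongoDB drivers support retryable writes) rather than by hand-written loops like this one.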

What about Cassandra?

In Cassandra there is no leader and any node can accept writes, so the cluster stays available for reads and writes even if some nodes go down, as long as enough replicas remain reachable to satisfy the requested consistency level.

What about consistency in Cassandra? Like MongoDB, Cassandra can be eventually consistent or strongly consistent, depending on how you read and write data.

You can specify a consistency level on each read/write operation. For example:

  • read/write data from one node
  • read/write data from a majority/quorum of nodes, and more

Let's say you use a consistency level of ONE for both reads and writes. Your write succeeds as soon as the data is written to one replica. If your read request then happens to go to another replica where the data has not been updated yet (due to high network latency or any other reason), you will end up reading the old data.

So, Cassandra is highly available, but because its consistency level is configurable, it is not always consistent.
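The stale read described above can be reproduced with a small toy model. This is a sketch in plain Python, not Cassandra's API: ToyCluster is a hypothetical class, a write simply lands on the first few replicas, and repair mechanisms such as read repair are left out. It contrasts consistency level ONE with a quorum on three replicas, where any two replicas read always overlap the two replicas written.

```python
from itertools import combinations


class ToyCluster:
    """Toy model of N replicas, each holding (timestamp, value) for one key."""

    def __init__(self, n):
        self.replicas = [(0, None)] * n
        self.clock = 0

    def write(self, value, acks):
        """Store value on the first `acks` replicas; the rest lag behind."""
        self.clock += 1
        for i in range(acks):
            self.replicas[i] = (self.clock, value)

    def read(self, contacted):
        """Return the newest value among the contacted replica indexes."""
        newest = max((self.replicas[i] for i in contacted), key=lambda r: r[0])
        return newest[1]


N = 3
QUORUM = N // 2 + 1  # 2 of 3 replicas

# Consistency level ONE: the write reaches replica 0 only.
one = ToyCluster(N)
one.write("old", acks=N)
one.write("new", acks=1)
stale = one.read(contacted=[2])  # read served by a lagging replica

# Quorum writes and reads: every possible read set overlaps the write set.
quorum = ToyCluster(N)
quorum.write("old", acks=N)
quorum.write("new", acks=QUORUM)
quorum_reads = [quorum.read(list(c)) for c in combinations(range(N), QUORUM)]

print(stale)
print(quorum_reads)
```

The design point is the overlap: when the number of replicas read plus the number written exceeds the replica count, at least one contacted replica has the latest write.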

In conclusion, in their default behavior, MongoDB falls under CP and Cassandra under AP.

Bikas Katwal answered Sep 20 '22 00:09
