
How can Cassandra have no single point of failure when it has no master and data is not replicated but distributed?

Tags:

cassandra

Perhaps I am misunderstanding, but the Apache Cassandra Wikipedia article says:

"Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request."

How can each node contain different data, yet there be no single point of failure? For instance, I would imagine that in this scenario, if the node containing the record I was querying went down, a different node would pick up the request. However, it would not have the data to satisfy it, since that data was on the node that went down.

Can someone clear this up for me?

Thanks!

Casey Jordan asked Feb 09 '14 19:02




1 Answer

Cassandra clusters do replicate data across the nodes. The specific number of replicas is configurable, but generally production clusters will use a replication factor of 3. This means that a given row will be stored on three different machines in the cluster, so if one of those machines goes down, two copies of the row are still available. See the reference documentation on replication for more details.
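To make the replication factor concrete, here is a minimal Python sketch of how a row key could be mapped to three nodes on a hash ring. It is only a toy model: the node names, the `token` and `replicas_for` helpers, and the use of MD5 are assumptions made for illustration, not Cassandra's actual partitioner or replication strategy.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (toy hash, not Cassandra's partitioner)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key: str, nodes: list[str], replication_factor: int = 3) -> list[str]:
    """Return the nodes that would hold a copy of `key` in this toy model.

    Walk clockwise from the key's token and take the next
    `replication_factor` distinct nodes.
    """
    ring = sorted((token(name), name) for name in nodes)
    tokens = [t for t, _ in ring]
    # First node whose token is past the key's token, wrapping around the ring.
    start = bisect.bisect_right(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical five-node cluster.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
owners = replicas_for("user:42", nodes)
print(owners)  # three distinct node names
```

Each node still holds different data overall, because each key lands on a different set of three nodes, but no single key lives on only one machine.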

In terms of servicing requests, if a node receives a request for data that it does not have, it will forward that request to the nodes that do own the data.
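The forwarding behavior can be sketched the same way. The `ToyCluster` class below is hypothetical and greatly simplified (no consistency levels, no hinted handoff, no real networking, and a made-up placement rule); it only shows why a read can still succeed when one of the nodes holding the row is down.

```python
import hashlib


class ToyCluster:
    """Toy model of request forwarding; not the real Cassandra protocol."""

    def __init__(self, node_names, replication_factor=3):
        self.names = list(node_names)
        self.rf = min(replication_factor, len(self.names))
        self.storage = {name: {} for name in self.names}  # per-node local data
        self.down = set()  # names of nodes that are currently unavailable

    def replicas_for(self, key):
        # Assumed placement rule: hash the key to a start index, take the next rf nodes.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.names)
        return [self.names[(start + i) % len(self.names)] for i in range(self.rf)]

    def write(self, key, value):
        # Store a copy on every replica that is currently up.
        for name in self.replicas_for(key):
            if name not in self.down:
                self.storage[name][key] = value

    def read(self, coordinator, key):
        if coordinator in self.down:
            raise ConnectionError(f"{coordinator} is down; a client would contact another node")
        # The contacted node may not hold the key itself; it forwards to a live replica.
        for name in self.replicas_for(key):
            if name not in self.down:
                return self.storage[name][key]
        raise RuntimeError("all replicas for this key are down")


cluster = ToyCluster(["node-a", "node-b", "node-c", "node-d", "node-e"])
cluster.write("user:42", "example row")

owners = cluster.replicas_for("user:42")
cluster.down.add(owners[0])  # one of the three replicas fails

# Any node that is still up can accept the request.
coordinator = next(name for name in cluster.names if name not in cluster.down)
print(cluster.read(coordinator, "user:42"))
```

In this model the read only fails once every replica of that key is down at the same time, which is the situation a replication factor of 3 is meant to make unlikely.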

psanford answered Sep 27 '22 18:09