 

How does Cassandra scale horizontally?

I've watched a video on the Cassandra database, which turned out to be very effective and really explained a lot about Cassandra. I've also read some articles and books about Cassandra, but the thing I could not understand is how Cassandra scales horizontally. By horizontal scaling I mean adding more nodes to gain more space. As I understand it, each node holds identical data, i.e. if one node has 1TB of data and it is replicated to the other nodes, this means all n nodes will each contain 1TB of data. Am I missing something here?

asked Jul 27 '15 by Adelin


People also ask

How is Cassandra scalable?

A single Cassandra instance is called a node. Cassandra supports horizontal scalability achieved by adding more than one node as a part of a Cassandra cluster. The scalability works with linear performance improvement if the resources are configured optimally.

What is linear scalability in Cassandra?

Yes, Cassandra has linear scalability. In one benchmark, each client system generated about 17,500 write requests per second, and there were no bottlenecks as the traffic was scaled up. Each client ran 200 threads to generate traffic across the cluster.

Which topology is used in Cassandra?

Cassandra has a ring-type architecture. Cassandra has no master nodes and no single point of failure. Cassandra supports network topology with multiple data centers, multiple racks, and nodes.
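
Replication across those data centers is configured per keyspace with NetworkTopologyStrategy. Below is a minimal sketch using the Python cassandra-driver; the contact point, the keyspace name demo, and the data center names dc1/dc2 are placeholders and must match what your cluster's snitch reports:

```python
from cassandra.cluster import Cluster

# Connect to any reachable node in the cluster (address is a placeholder).
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# Keep 3 replicas in dc1 and 2 in dc2; the data center names
# must match those reported by the cluster's snitch.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc1': 3,
        'dc2': 2
    }
""")
```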

How does Cassandra provide high write throughput?

Writes go to an in-memory data structure (the Memtable), and writing to memory is much faster than writing to disk; this is why Cassandra writes are extremely fast. For durability, each write is also appended to a sequential commit log on disk, and the write is replicated to multiple other nodes, so if one node loses its Memtable data, there are mechanisms in place for eventual consistency.
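
As a rough illustration only (a toy model, not Cassandra's actual implementation), that write path can be sketched as an append-only log plus an in-memory table that is periodically flushed to an immutable SSTable-like file:

```python
import json

class ToyWriteStore:
    """Toy model of Cassandra's write path: append each mutation to a
    commit log, update an in-memory Memtable, and periodically flush
    the Memtable to an immutable SSTable-like file."""

    def __init__(self, commitlog_path, flush_threshold=1000):
        self.commitlog = open(commitlog_path, "a")  # sequential append: cheap
        self.memtable = {}                          # in-memory: fastest
        self.flush_threshold = flush_threshold
        self.sstable_count = 0

    def write(self, key, value):
        # 1. Durability: append the mutation to the commit log.
        self.commitlog.write(json.dumps({"k": key, "v": value}) + "\n")
        self.commitlog.flush()
        # 2. Speed: update the Memtable; no random disk I/O on the write path.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        # Write the sorted Memtable to an immutable file, then start fresh.
        path = f"sstable-{self.sstable_count}.json"
        with open(path, "w") as f:
            json.dump(dict(sorted(self.memtable.items())), f)
        self.sstable_count += 1
        self.memtable.clear()
```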


3 Answers

Yes, you are missing something. Data does not have to be duplicated n times, where n is the number of nodes. You would typically configure your replication factor (RF) to be lower than the number of nodes (N).

For example, RF = 3, N = 5. Each row is then stored on 3 of the 5 nodes, with the replicas chosen deterministically by the partitioner from the token ring. Note that RF counts the total number of copies; there is no extra "pristine" copy on top. If one of those nodes goes down, you still have 2 copies elsewhere on the other nodes.

This works better in larger clusters, e.g. RF = 5, N = 100.

A higher RF improves data redundancy and read throughput, but decreases your write speed, so there is a balance. If your RF is very high, like RF = N, you'd have very high data redundancy, high resilience to node failures, and high read throughput. On the other hand, your write throughput will be very limited, as every write needs to be replicated to all the nodes. If one node goes down in this scenario, writes may fail (depending on the consistency level the client requests), because the required number of replicas cannot be reached.
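
To make the placement concrete, here is a minimal sketch of SimpleStrategy-style placement, using md5 as a stand-in for Cassandra's real Murmur3 partitioner and hypothetical node names: hash the partition key onto the token ring, then walk clockwise until RF distinct nodes are collected.

```python
import hashlib
from bisect import bisect_right

def replicas_for(key, ring, rf):
    """Return the rf nodes owning `key`, SimpleStrategy-style:
    find the first token >= hash(key), then walk the ring clockwise
    until rf distinct nodes are collected.
    ring: sorted list of (token, node_name) pairs."""
    token = int(hashlib.md5(key.encode()).hexdigest(), 16)  # stand-in partitioner
    tokens = [t for t, _ in ring]
    i = bisect_right(tokens, token)          # wraps past the end via modulo below
    replicas = []
    while len(replicas) < rf:
        node = ring[i % len(ring)][1]
        if node not in replicas:             # one copy per physical node
            replicas.append(node)
        i += 1
    return replicas

# A 5-node cluster with one token per node and RF = 3:
# every row lands on exactly 3 of the 5 nodes.
nodes = ["node1", "node2", "node3", "node4", "node5"]
ring = sorted((int(hashlib.md5(n.encode()).hexdigest(), 16), n) for n in nodes)
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", replicas_for(key, ring, rf=3))
```

Running this for different keys shows each row landing on 3 of the 5 nodes, so the total data set is spread across the cluster rather than copied everywhere.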

answered Sep 23 '22 by oleksii


The number of replicas (i.e. identical copies of the data) stored for each partition (row/piece of data) is configurable. So if you have n nodes, you could in theory set the database to replicate each partition n times; in that case adding more nodes would not gain you any space. However, if you set the number of replicas to 1 or 2, each node only has to store a fraction of the data, and new data can then go onto new nodes. Keep in mind, though, that with fewer replicas you have a greater chance of losing data if several nodes go down at the same time.
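
Replication is configured per keyspace and can be changed later. A minimal sketch using the Python cassandra-driver, reusing the hypothetical keyspace demo from the earlier sketch; after raising the RF you would run nodetool repair so the new replicas are actually populated:

```python
from cassandra.cluster import Cluster

# Contact point is a placeholder for any reachable node.
session = Cluster(["127.0.0.1"]).connect()

# Raise the keyspace's replication factor from 2 to 3. Afterwards,
# run `nodetool repair demo` so the new third replica of each
# partition is actually streamed to its node.
session.execute("""
    ALTER KEYSPACE demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
```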

answered Sep 21 '22 by Rdesmond


As I understand it, each node holds identical data, i.e. if one node has 1TB of data and it is replicated to the other nodes, this means all n nodes will each contain 1TB of data. Am I missing something here?

Yes, not all nodes are necessarily copies of each other. Depending on the level of availability I want to support, I can set my replication factor lower than the total number of nodes.

Let's say that I have a 2 node cluster with a replication factor of 2. So in this case, each node does have a complete copy of the data. If I am running out of disk, I can alleviate some of that by adding a new node while keeping my replication factor set at 2 (3 nodes, RF of 2).

In this way if each disk has 1TB of storage, and I'm at 900GB on each, adding a new node (while keeping my RF the same) makes each node responsible for only 2/3 of the data. So in this case, each node would hold 600GB of data (freeing up 300GB on my 2 existing nodes). And thus, I have increased my disk capacity by scaling horizontally.
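
The arithmetic generalizes: with RF copies of the data spread evenly over N nodes, each node holds roughly RF/N of the unique data set. A quick check of the numbers above (a hypothetical helper, not a Cassandra API):

```python
def per_node_gb(unique_gb, rf, n):
    """Storage each node holds when rf copies of unique_gb of data
    are spread evenly over n nodes."""
    return unique_gb * rf / n

print(per_node_gb(900, rf=2, n=2))  # 900.0 GB: two nodes, each a full copy
print(per_node_gb(900, rf=2, n=3))  # 600.0 GB: third node added, RF unchanged
```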

The catch is that even though I have 3 nodes, I can really only afford to lose one of them. If I lose two nodes, then I can't serve my queries.

answered Sep 19 '22 by Aaron