Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does Cassandra partitioning work when replication factor == cluster size?

Background:

I'm new to Cassandra and still trying to wrap my mind around the internal workings.

I'm thinking of using Cassandra in an application that will only ever have a limited number of nodes (less than 10, most commonly 3). Ideally each node in my cluster would have a complete copy of all of the application data. So, I'm considering setting replication factor to cluster size. When additional nodes are added, I would alter the keyspace to increment the replication factor setting (nodetool repair to ensure that it gets the necessary data).

I would be using the NetworkTopologyStrategy for replication to take advantage of knowledge about datacenters.

In this situation, how does partitioning actually work? I've read about a combination of nodes and partition keys forming a ring in Cassandra. If all of my nodes are "responsible" for each piece of data regardless of the hash value calculated by the partitioner, do I just have a ring of one partition key?

Are there tremendous downfalls to this type of Cassandra deployment? I'm guessing there would be lots of asynchronous replication going on in the background as data was propagated to every node, but this is one of the design goals so I'm okay with it.

The consistency level on reads would probably generally be "one" or "local_one".

The consistency level on writes would generally be "two".

Actual questions to answer:

  1. Is replication factor == cluster size a common (or even a reasonable) deployment strategy aside from the obvious case of a cluster of one?
  2. Do I actually have a ring of one partition where all possible values generated by the partitioner go to the one partition?
  3. Is each node considered "responsible" for every row of data?
  4. If I were to use a write consistency of "one" does Cassandra always write the data to the node contacted by the client?
  5. Are there other downfalls to this strategy that I don't know about?
like image 864
petrsnd Avatar asked Apr 14 '15 18:04

petrsnd


People also ask

How does replication factor work in Cassandra?

A replication factor of one means that there is only one copy of each row in the Cassandra cluster. A replication factor of two means there are two copies of each row, where each copy is on a different node. All replicas are equally important; there is no primary or master replica.

How is data partitioned in Cassandra?

As we learned earlier, Cassandra uses a consistent hashing technique to generate the hash value of the partition key (app_name) and assign the row data to a partition range inside a node.

What is Cassandra partition size?

The maximum partition size in Cassandra should be under 100MB and ideally less than 10MB. Application workload and its schema design haves an effect on the optimal partition value. However, a maximum of 100MB is a rule of thumb.


1 Answers

Do I actually have a ring of one partition where all possible values generated by the partitioner go to the one partition?

Is each node considered "responsible" for every row of data?

If all of my nodes are "responsible" for each piece of data regardless of the hash value calculated by the partitioner, do I just have a ring of one partition key?

Not exactly, C* nodes still have token ranges and c* still assigns a primary replica to the "responsible" node. But all nodes will also have a replica with RF = N (where N is number of nodes). So in essence the implication is the same as what you described.

Are there tremendous downfalls to this type of Cassandra deployment? Are there other downfalls to this strategy that I don't know about?

Not that I can think of, I guess you might be more susceptible than average to inconsistent data so use C*'s anti-entropy mechanisms to counter this (repair, read repair, hinted handoff).

Consistency level quorum or all would start to get expensive but I see you don't intend to use them.

Is replication factor == cluster size a common (or even a reasonable) deployment strategy aside from the obvious case of a cluster of one?

It's not common, I guess you are looking for super high availability and all your data fits on one box. I don't think I've ever seen a c* deployment with RF > 5. Far and wide RF = 3.

If I were to use a write consistency of "one" does Cassandra always write the data to the node contacted by the client?

This depends on your load balancing policies at the driver. Often we select token aware policies (assuming you're using one of the Datastax drivers), in which case requests are routed to the primary replica automatically. You could use round robin in your case and have the same effect.

like image 166
phact Avatar answered Sep 19 '22 06:09

phact