
Cassandra Vnodes and token Ranges

Tags:

cassandra

I know that vnodes form many token ranges for each node, controlled by setting num_tokens in the cassandra.yaml file.

Say, for example (a), I have 6 nodes, and on each node I have set num_tokens=256. How many virtual nodes are formed among these 6 nodes? That is, how many virtual nodes or sub token ranges are contained in each physical node?

According to my understanding, when every node has num_tokens set to 256, it means that all 6 nodes contain 256 vnodes each. Is this statement true? If not, how do vnodes form the ranges of tokens (obviously random) in each node? It would be really convenient if someone could explain this using the example mentioned as (a).

What does the ring of vnodes signify in this image: http://docs.datastax.com/en/cassandra/3.x/cassandra/images/arc_vnodes_compare.png (taken from: http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2 )?

Sandhya Km asked Jun 21 '16 09:06





1 Answer

Every partition key in Cassandra is converted to a numerical token value using the Murmur3 hash function. The token range runs from -2^63 to +2^63 - 1, the same as a signed Java long. num_tokens defines how many token ranges are assigned to a node. Each node calculates 256 (num_tokens) random values in the token range and informs the other nodes what they are, so when a node needs to coordinate a request for a specific token it knows which nodes are responsible for it, according to the replication factor and DC/rack placement. A better description for this feature would be "automatic token range assignment for better streaming capabilities"; calling it "virtual" is a bit confusing.

In your case you have 6 nodes, each set with 256 token ranges, so you have 6*256 = 1536 token ranges in total, and each physical node contains 256 token ranges.
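To make the 6-node arithmetic concrete, here is a minimal Python sketch of the random assignment described above. It is a simplified model and not Cassandra's actual code: the function name assign_tokens, the node names, and the fixed seed are all made up for illustration, and gossip, replication, and rack placement are not modeled.

```python
import random

# Murmur3 token space: the range of a signed 64-bit integer.
TOKEN_MIN = -2**63
TOKEN_MAX = 2**63 - 1


def assign_tokens(node_names, num_tokens, seed=0):
    """Give every node `num_tokens` unique random tokens.

    Returns a dict mapping token -> owning node (hypothetical helper).
    """
    rng = random.Random(seed)
    ring = {}
    for node in node_names:
        assigned = 0
        while assigned < num_tokens:
            token = rng.randint(TOKEN_MIN, TOKEN_MAX)
            if token not in ring:  # a token can only belong to one node
                ring[token] = node
                assigned += 1
    return ring


nodes = [f"node{i}" for i in range(1, 7)]  # example (a): 6 physical nodes
ring = assign_tokens(nodes, num_tokens=256)

# Each token closes exactly one range, so counting tokens counts ranges.
ranges_per_node = {n: 0 for n in nodes}
for owner in ring.values():
    ranges_per_node[owner] += 1

total_ranges = len(ring)
print(total_ranges)      # 1536
print(ranges_per_node)   # every node maps to 256
```

Every token on the ring ends exactly one range, which is why the total is simply the number of nodes times num_tokens.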

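For example, consider 2 nodes with num_tokens set to 4 and a token range of 0 to 100.

Node 1 calculates tokens 17, 35, 77, 92
Node 2 calculates tokens 4, 25, 68, 85

The ring shows the distribution of token ranges. A node is responsible for the range that ends at each of its tokens, starting just after the previous token on the ring. So in this case Node 1 is responsible for token ranges 4-17, 25-35, 68-77, 85-92 and Node 2 for the rest (17-25, 35-68, 77-85, and the wrap-around range 92-4).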
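The toy ring from this example can be sketched in a few lines of Python. The sketch assumes Cassandra's convention that the node holding token T owns the range from the previous token on the ring (exclusive) up to T (inclusive). The helper names owner_of and owned_ranges are hypothetical, and replication is ignored.

```python
from bisect import bisect_left

# Tokens from the example (toy token space 0..100).
node_tokens = {
    "node1": [17, 35, 77, 92],
    "node2": [4, 25, 68, 85],
}

# The ring is the sorted list of (token, owner) pairs.
ring = sorted((t, n) for n, ts in node_tokens.items() for t in ts)
tokens = [t for t, _ in ring]


def owner_of(token):
    """Return the node whose token is the first one >= `token`, wrapping around."""
    i = bisect_left(tokens, token) % len(ring)
    return ring[i][1]


def owned_ranges(node):
    """List the (start, end] ranges owned by `node`; start is exclusive."""
    result = []
    for i, (tok, owner) in enumerate(ring):
        if owner == node:
            # ring[i - 1] with i == 0 wraps around to the last token.
            result.append((ring[i - 1][0], tok))
    return result


print(owned_ranges("node1"))  # [(4, 17), (25, 35), (68, 77), (85, 92)]
print(owned_ranges("node2"))  # [(92, 4), (17, 25), (35, 68), (77, 85)]
print(owner_of(50))           # node2, because 50 falls in (35, 68]
```

The pair (92, 4) is the wrap-around range: tokens above 92 and tokens up to 4 both land on the node holding token 4.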

Oded Peer answered Sep 29 '22 02:09
