
Cassandra: Snitch vs. Gossip

Tags:

cassandra

I can't understand the difference between Snitch and Gossip in Cassandra, and I can't find a single source that discusses the subject, let alone provides a good answer. It seems to me that Snitch and Gossip are both inter-node communication protocols, so why do we need two of them?

I know that Gossip helps a node get information from bootstrap nodes, but that doesn't really explain the difference: when a node starts, it also needs to learn about the data centers and racks, which is supposed to be the domain of the Snitch.

user1888243 asked Sep 30 '17 01:09





1 Answer

Gossip is a protocol, and the snitch is a component that builds on it. Gossip itself does little more than spread state from node to node, following rules that ensure every node is eventually reached, so that each node learns technical data about the others, such as IP address and health. The snitch adds some logic on top: it determines which data center and rack each node belongs to. How it does that depends on the snitch implementation. RackInferringSnitch, for example, infers the data center and rack from the octets of a node's IP address, while GossipingPropertyFileSnitch reads the local node's data center and rack from a local properties file and relies on gossip to propagate them to the other nodes. Other components then use this topology for further actions, such as replica placement.

So the two are not competing protocols. The snitch functionality was given a separate name to identify it, and it is really all about layering: gossip carries the information, and the snitch gives it topological meaning.
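
To make the layering concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not Cassandra's implementation: `Node`, `gossip_with`, `gossip_round`, `ToySnitch`, and `topology_view` are hypothetical names, the addresses are made up, and the octet-based rule only imitates the idea behind RackInferringSnitch (second octet as data center, third octet as rack).

```python
import random


class ToySnitch:
    """Hypothetical snitch: infers topology from an IPv4 address.

    Assumption: second octet = data center, third octet = rack,
    imitating the idea behind RackInferringSnitch.
    """

    def datacenter(self, ip):
        return "DC" + ip.split(".")[1]

    def rack(self, ip):
        return "RAC" + ip.split(".")[2]


class Node:
    """Hypothetical node that only knows itself until it gossips."""

    def __init__(self, ip):
        self.ip = ip
        # Gossip state in this toy: ip -> (version, status).
        self.known = {ip: (1, "UP")}

    def gossip_with(self, peer):
        """Exchange state; both sides keep the highest version per ip."""
        merged = dict(self.known)
        for ip, state in peer.known.items():
            if ip not in merged or state[0] > merged[ip][0]:
                merged[ip] = state
        self.known = dict(merged)
        peer.known = dict(merged)


def gossip_round(cluster, seed_ip, rng):
    """Each node gossips with one random known peer, then with the seed."""
    for node in cluster.values():
        peers = sorted(set(node.known) - {node.ip})
        if peers:
            node.gossip_with(cluster[rng.choice(peers)])
        if node.ip != seed_ip:
            node.gossip_with(cluster[seed_ip])


def topology_view(node, snitch):
    """Group the endpoints a node learned via gossip by data center and rack."""
    view = {}
    for ip in sorted(node.known):
        view.setdefault(snitch.datacenter(ip), []).append((snitch.rack(ip), ip))
    return view


ips = ["10.1.1.1", "10.1.2.1", "10.2.1.1", "10.2.1.2"]
cluster = {ip: Node(ip) for ip in ips}
rng = random.Random(0)
for _ in range(2):  # two rounds suffice here because every node contacts the seed
    gossip_round(cluster, seed_ip=ips[0], rng=rng)

# The last node started out knowing only itself; gossip filled in the rest,
# and the snitch turns the addresses into a data center / rack layout.
view = topology_view(cluster[ips[-1]], ToySnitch())
for dc, members in view.items():
    print(dc, members)
```

In this sketch gossip only moves endpoint state around, while the snitch is the piece that turns an address into a data center and rack. A replication strategy would be the consumer of the resulting view.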

Some relevant information can also be found here: https://books.google.ru/books?id=h36CCwAAQBAJ&pg=PT21&lpg=PT21&dq=snitch+gossip&source=bl&ots=fjxy_z78Gj&sig=KpqdkKaREIo2YAWyJj3yMZCyNn4&hl=ru&sa=X&ved=0ahUKEwiUktS8q8zWAhWIQZoKHTViD0U4ChDoAQhUMAc#v=onepage&q=snitch%20gossip&f=false

And here is a more detailed definition of snitches (written for Scylla rather than Cassandra): https://github.com/scylladb/scylla/wiki/Snitches

Stepan Pogosyan answered Oct 12 '22 00:10
