I can't understand the difference between Snitch and Gossip in Cassandra, and I can't find even one source which has discussed the subject, let alone providing a good answer. Seems to me that Snitch and Gossip are both inter-node communication protocols; so why do we need 2 of them?
I know that Gossip helps a node to get information from bootstrap nodes, but that doesn't really explain the difference since when a node starts, it needs to learn about the data centers and racks as well which is supposed to be the domain of the Snitch.
A snitch determines which datacenters and racks nodes belong to. A snitch determines which datacenters and racks nodes belong to. They inform Cassandra about the network topology so that requests are routed efficiently and allows Cassandra to distribute replicas by grouping machines into datacenters and racks.
Cassandra uses a protocol called gossip to discover location and state information about the other nodes participating in a Cassandra cluster.
Ec2Snitch: It is important snitch for deployments and it is a simple snitch for Amazon EC2 deployments where all nodes are in a single region. In Ec2Snitch region name refers to data center and availability zone refers to rack in a cluster.
Cassandra is a peer-to-peer, read/write anywhere architecture, so any user can connect to any node in any data center and read/write the data they need, with all writes being partitioned and replicated for them automatically throughout the cluster.
Gossip is a protocol and Snitch is a component which utilizes it. Snitch is a little bit more than gossip and it has at least some heuristics like identifying data centers or racks while gossip is like a convenient tool to get this information. Almost all that gossip is doing is spreading arround with some rules to cover all necessary nodes and receive some technical data like ip, health etc. While Snitch utilizes this info to perform something more. One of its features is to identify different data centers by analyzing received ips. Then this info is used by other components for further actions like replicas location etc. So they've decided to give this functionality separate name to identify it and actually it's all about layering the functionality.
Some relevant information also can be found here: https://books.google.ru/books?id=h36CCwAAQBAJ&pg=PT21&lpg=PT21&dq=snitch+gossip&source=bl&ots=fjxy_z78Gj&sig=KpqdkKaREIo2YAWyJj3yMZCyNn4&hl=ru&sa=X&ved=0ahUKEwiUktS8q8zWAhWIQZoKHTViD0U4ChDoAQhUMAc#v=onepage&q=snitch%20gossip&f=false
And here is a more detailed snitch definition (but in scylla): https://github.com/scylladb/scylla/wiki/Snitches
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With