
Methods to Verify Cassandra Node Sync

I have a 3-node Cassandra cluster with a replication factor of 2. One of the nodes has been replaced with a new one, and I have used "nodetool repair" to repair all the keyspaces, but I don't know how to verify that all the keyspaces are in sync.

I found this article earlier, but it only helped a little: Cassandra Data Replication problem

Is there any way to verify the keyspaces with replication factor > 1 in Cassandra?

Thanks a lot.

stephon

stephon asked Mar 27 '12 07:03


People also ask

How do you check Cassandra nodes?

To check the status of the Cassandra nodes in your cluster, go to the /<Install_Dir>/apache-cassandra/bin/ directory and run ./nodetool status. If the status for every node shows as UN, the nodes are up and running. If the status for any node shows as DN, that node is down.
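The UN/DN check above can be automated. Here is a minimal Python sketch that scans the text printed by nodetool status and lists every node that is not UN. The sample output, addresses and host IDs are made up for illustration, and the function name down_nodes is hypothetical.

```python
def down_nodes(status_output: str) -> list:
    """Return the addresses of nodes whose status code is not 'UN'."""
    bad = []
    for line in status_output.splitlines():
        parts = line.split()
        if len(parts) < 2 or len(parts[0]) != 2:
            continue
        code = parts[0]
        # Node rows start with a two-letter code: U/D (up/down)
        # followed by N/L/J/M (normal/leaving/joining/moving).
        if code[0] in "UD" and code[1] in "NLJM" and code != "UN":
            bad.append(parts[1])
    return bad


# Illustrative output only; real values will differ on your cluster.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns    Host ID  Rack
UN  10.0.0.1   1.2 GB   256     66.7%   id-1     rack1
UN  10.0.0.2   1.1 GB   256     66.6%   id-2     rack1
DN  10.0.0.3   1.3 GB   256     66.7%   id-3     rack1
"""

print(down_nodes(SAMPLE))
```

In practice you would feed it the real command output, for example via subprocess, instead of SAMPLE.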

How is node failure detected in Cassandra?

Rather than have a fixed threshold for marking failing nodes, Cassandra uses an accrual detection mechanism to calculate a per-node threshold that takes into account network performance, workload, and historical conditions.
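To give a feel for what "accrual" means here, the following is a simplified Python sketch, not Cassandra's actual implementation. It assumes heartbeat intervals follow an exponential distribution, so the suspicion level phi grows in proportion to the time since the last heartbeat divided by the average interval seen so far. The class name and the window size are assumptions made for the example.

```python
import math
from collections import deque


class AccrualDetector:
    """Toy accrual failure detector for a single peer (illustrative only)."""

    def __init__(self, window: int = 1000):
        # Keep only the most recent heartbeat intervals.
        self.intervals = deque(maxlen=window)
        self.last = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received at time `now` (in seconds)."""
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now: float) -> float:
        """Suspicion level: -log10 of P(interval > elapsed) under an
        exponential model with the observed mean interval."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last
        return elapsed / (mean * math.log(10))


detector = AccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)

print(detector.phi(4.0))   # small: one interval late
print(detector.phi(23.0))  # large: many intervals without a heartbeat
```

A node is suspected once phi crosses a configured threshold. Because the mean adapts to the observed intervals, a slow but steady link does not trigger the detector as quickly as a fixed timeout would.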

How do you check for replication in Cassandra?

If you want to look at the replication factor of a given keyspace, run SELECT * FROM system_schema.keyspaces; (available in Cassandra 3.0 and later) and it will print the replication settings for every keyspace.


1 Answer

First, if you run nodetool repair again and very little data is transferred (assuming all nodes have been up since the last repair), you know that the data is almost perfectly in sync. The logs show how much data was transferred during the repair.

Second, you can verify that all of the nodes are receiving a similar number of writes by comparing the write counts reported by nodetool cfstats (nodetool tablestats in newer releases). Note that the write count is reset each time Cassandra restarts, so if the nodes weren't restarted around the same time, compare how quickly each count increases over time instead.
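Comparing rates rather than raw counts is easy to script. Here is a rough Python sketch, assuming you have already collected the write count for each node at two points in time. The node names, numbers, helper names and the 20% tolerance are all made up for illustration.

```python
def write_rates(first: dict, second: dict, seconds: float) -> dict:
    """Writes per second for each node between two samples."""
    rates = {}
    for node, before in first.items():
        after = second[node]
        if after < before:
            # The counter went backwards, so the node restarted in between.
            raise ValueError(f"write count was reset on {node}; sample again")
        rates[node] = (after - before) / seconds
    return rates


def rates_similar(rates: dict, tolerance: float = 0.2) -> bool:
    """True if the slowest node is within `tolerance` of the fastest."""
    lo, hi = min(rates.values()), max(rates.values())
    return hi == 0 or (hi - lo) / hi <= tolerance


# Hypothetical write counts taken 60 seconds apart. The raw counts differ
# a lot because the nodes were restarted at different times.
first = {"node1": 100, "node2": 5000, "node3": 40}
second = {"node1": 700, "node2": 5590, "node3": 650}

rates = write_rates(first, second, seconds=60)
print(rates)
print(rates_similar(rates))
```

A node whose rate is far below the others is worth a closer look, though uneven token ownership or a skewed workload can also explain a gap.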

Last, if you just want to spot-check a few recently updated values, you can try reading them at consistency level ONE. If you always get the most up-to-date version of the data, you'll know that the replicas are likely in sync.
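A spot check like this is more convincing if each key is read several times, since a read at consistency level ONE is answered by a single replica and repeated reads give different replicas a chance to respond. Here is a small Python sketch of that loop. The read_at_one callable is a placeholder for whatever your client library uses to read a key at consistency level ONE, and the fake replicas below exist only to make the example runnable.

```python
import itertools


def spot_check(read_at_one, expected: dict, attempts: int = 10) -> list:
    """Return the keys for which any of `attempts` reads was stale."""
    stale = []
    for key, want in expected.items():
        for _ in range(attempts):
            if read_at_one(key) != want:
                stale.append(key)
                break
    return stale


# Stand-in for a real cluster: two replicas, one of which missed an update
# to "k1". Reads alternate between them.
replica_a = {"k1": "new", "k2": "same"}
replica_b = {"k1": "old", "k2": "same"}
turn = itertools.cycle([replica_a, replica_b])


def fake_read(key):
    return next(turn)[key]


print(spot_check(fake_read, {"k1": "new", "k2": "same"}))
```

An empty result does not prove the replicas are identical; it only means none of the sampled reads came back stale.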

As a general note, replication is such an ingrained part of Cassandra that it's extremely unlikely to fail on its own without you noticing. Typically a node will be marked down shortly after problems start. Also, I'm assuming you're writing at consistency level ONE or ANY; with anything higher and a replication factor of 2, you know for sure that both replicas have received the write.
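The reason anything above ONE covers both replicas comes down to simple arithmetic: a quorum is floor(RF / 2) + 1, which equals 2 when the replication factor is 2. Here is a short Python sketch of that calculation. The function name is made up, and only a few of the consistency levels are included.

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must acknowledge a write at this level."""
    table = {
        "ANY": 0,              # a stored hint is enough, no replica needed
        "ONE": 1,
        "TWO": 2,
        "QUORUM": rf // 2 + 1,
        "ALL": rf,
    }
    return table[level.upper()]


for level in ("ANY", "ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, rf=2))
```

With a replication factor of 3 the picture changes: QUORUM needs only 2 of the 3 replicas, so one replica can still lag behind after a successful write.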

Tyler Hobbs answered Oct 16 '22 00:10