 

Cassandra nodetool repair best practices

Tags:

cassandra

This question applies to Cassandra 2.2

I am embarrassed to say that I still do not understand when I should be running a nodetool repair, or to be more precise on which nodes.

So far, I understand that to ensure deletes are handled correctly I should be running repairs at an interval shorter than GC_GRACE_SECONDS. So that's cool, got that bit.

Q. If I have a cluster of 9 nodes with a replication factor of 3, what type of repair do I run? More importantly, do I run the repair on every node, or just one node?

Q. If I have multiple data centers, does that change how I run repairs? Do I have to run them in each DC, or can it be coordinated from just one node in one DC?

I am hoping this is a trivial question and someone can just tell it how it is.

L. Smith asked Jun 20 '16 11:06





1 Answer

The nodetool repair command runs against a single node: the local one by default, or the one you name with -h. That node becomes the coordinator for the repair.

Without the -pr option, it repairs every token range that node holds a replica of, so the other replicas of those ranges take part as well. With a replication factor of 3, running it this way on every node repairs each range three times.

Run nodetool repair -pr on every node in the cluster to repair all data. The -pr option limits each run to that node's primary range, so skipping a node leaves its range unrepaired.
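A minimal sketch of that node-by-node run. The host addresses, the function name repair_each_node, and the DRY_RUN variable are all made up for illustration, and it assumes JMX is reachable from the machine running the script (if it is not, run the same command over ssh on each node instead). By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: primary-range repair on each node, one node at a time.
# repair_each_node and DRY_RUN are illustrative names, not part of nodetool.
DRY_RUN="${DRY_RUN:-1}"

repair_each_node() {
  for host in "$@"; do
    if [ "$DRY_RUN" = "1" ]; then
      # Only show what would be run.
      echo "nodetool -h $host repair -pr"
    else
      # Stop at the first failure so a failed repair is not silently skipped.
      nodetool -h "$host" repair -pr || return 1
    fi
  done
}

# Hypothetical addresses; list every node in every datacenter here.
repair_each_node 10.0.0.1 10.0.0.2 10.0.0.3
```

Set DRY_RUN=0 to actually run the repairs. Going one node at a time keeps the extra load on the cluster bounded, at the cost of a longer total run.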

The -pr option also works across multiple datacenters: run it on every node in every datacenter.

Note: For Cassandra 2.2 and later, the recommended option for repairs across datacenters is -dcpar (--dc-parallel), which repairs the datacenters in parallel.
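As a rough illustration of choosing the flags, here is a small helper that adds -dcpar only when there is more than one datacenter. The helper name repair_flags and the keyspace my_keyspace are hypothetical, and combining -pr with -dcpar is an assumption here, so check it against the nodetool help for your version before relying on it.

```shell
#!/bin/sh
# Sketch: pick repair flags from the number of datacenters.
# repair_flags is an illustrative helper, not a nodetool feature.
repair_flags() {
  dc_count="$1"
  if [ "$dc_count" -gt 1 ]; then
    # More than one datacenter: repair them in parallel.
    echo "-pr -dcpar"
  else
    echo "-pr"
  fi
}

# Print the command for a hypothetical two-datacenter cluster.
echo "nodetool repair $(repair_flags 2) my_keyspace"
```

The printed command would still need to be run on every node, as described above.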

Nodetool Repair

undefined_variable answered Oct 18 '22 18:10
