Cassandra node - rebuild vs. repair

1 Answer

nodetool rebuild: similar to the bootstrap process (what happens when you add a new node to a cluster), but applied to a datacenter. It is mainly a stream from the nodes that are already live to the new nodes, which start out empty. Once the key (token) ranges for the new nodes have been determined, which is very fast, the rest can be viewed as a copy operation.
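To make the "copy operation" idea concrete, here is a toy sketch in Python. It is not how Cassandra is implemented (real nodes stream SSTable sections between hosts); the names `in_range`, `rebuild`, `new_node_ranges` and `source_rows` are made up for illustration, and the small integer tokens are an assumption.

```python
# Toy model only: real Cassandra streams SSTable sections, not Python dicts.

def in_range(token, start, end):
    """True if token falls in the half-open ring range (start, end]."""
    if start < end:
        return start < token <= end
    return token > start or token <= end  # range wraps around the ring


def rebuild(new_node_ranges, source_rows):
    """Copy every row whose token falls in a range assigned to the empty node.

    new_node_ranges: list of (start, end) token ranges (hypothetical layout)
    source_rows: dict mapping token -> row held by an already-live node
    """
    local = {}  # the new node starts empty
    for start, end in new_node_ranges:
        for token, row in source_rows.items():
            if in_range(token, start, end):
                local[token] = row  # plain copy, no comparison needed
    return local


source = {10: "a", 40: "b", 70: "c", 95: "d"}
copied = rebuild([(30, 80), (90, 5)], source)
print(sorted(copied.items()))
```

The point of the sketch is that nothing is compared: because the target is empty, every row in its ranges is simply copied over.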

nodetool repair -pr: not a copy operation. The node being repaired is not empty; it already holds data. If the replication factor is greater than 1, that data has to be compared with the data on the other replicas, and any differences are corrected. The -pr flag restricts the repair to the node's primary ranges.

The process involves a lot of network traffic, but most of it is not the data itself: the node being repaired requests a Merkle tree (basically a tree of hashes) to verify whether both nodes hold the same information. Where the trees differ, it requests a full stream of the section of data that contains the difference, so that all replicas end up with the same data. Streaming the hashes is faster than streaming all the data before verifying it; this relies on the assumption that most data is identical on both nodes, with only a few differences here and there.

Repair does not itself remove tombstones (the markers written when data is deleted); compaction purges them once gc_grace_seconds has elapsed. Repair matters for deletes because it makes sure every replica has received the tombstone before that happens, so running it within gc_grace_seconds keeps deleted data from reappearing.
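The Merkle tree comparison can be sketched roughly as follows. This is a simplified Python illustration, not Cassandra's actual code: `TOKEN_SPACE`, `build_tree` and `differing_leaves` are hypothetical names, the ring is assumed to have 100 tokens split into 4 leaf ranges, and SHA-256 is used only because it is available in the standard library.

```python
import hashlib

TOKEN_SPACE = 100  # assumed toy ring size; real token spaces are far larger


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(rows, num_leaves=4):
    """Return the tree as a list of levels, leaves first and root last.

    num_leaves must be a power of two; each leaf covers one token sub-range.
    """
    buckets = [[] for _ in range(num_leaves)]
    for token in sorted(rows):
        buckets[token * num_leaves // TOKEN_SPACE].append(f"{token}={rows[token]}")
    level = [h("|".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk both trees from the root and return indexes of leaves that differ."""
    pending = [(len(tree_a) - 1, 0)]  # (level, index), starting at the root
    diffs = []
    while pending:
        depth, idx = pending.pop()
        if tree_a[depth][idx] == tree_b[depth][idx]:
            continue  # whole subtree matches, nothing to stream
        if depth == 0:
            diffs.append(idx)  # this token sub-range must be streamed
        else:
            pending.extend([(depth - 1, 2 * idx), (depth - 1, 2 * idx + 1)])
    return sorted(diffs)


replica_a = {10: "a", 40: "b", 70: "c", 95: "d"}
replica_b = {10: "a", 40: "b", 70: "STALE", 95: "d"}
print(differing_leaves(build_tree(replica_a), build_tree(replica_b)))
```

Only the hashes travel between nodes during the comparison; actual rows are streamed only for the leaf ranges reported as different, which is why the approach pays off when replicas mostly agree.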

Hope it helps!

answered Sep 19 '22 21:09

Sergio Ayestarán

Stevan Markovic
