Best way to add multiple nodes to existing cassandra cluster

Tags:

cassandra-2.0

We have a 12 node cluster with 2 datacenters(each DC has 6 nodes) with RF-3 in each DC.

We are planning to increase cluster capacity by adding 3 nodes in each DC(total 6 nodes). What is the best way to add multiple nodes at once.(ya,may be with 2 min difference).

auto_bootstrap:false - Use auto_bootstrap:false(as this is quicker process to start nodes) on all new nodes , start all nodes & then run 'nodetool rebuild' to get data streamed to this new nodes from exisitng nodes.

If I go this way , where read requests go soon starting this new nodes , as at this point it has only token range assigned to them(new nodes) but NO data got streamed to this nodes , will it cause Read request failures/CL issues/any other issue?

auto_bootstrap:true - Use auto_bootstrap:true and then start one node at a time , wait until streaming process finishes(this might take time I guess as we have huge data approx 600 GB+ on each node) before starting next node. If I go this way , I have to wait until whole streaming process done on a node before proceed adding next new node.

Kindly suggest a best way to add multiple nodes all at once.

PS: We are using c*-2.0.3.

Thanks in advance.

590

asked May 17 '16 18:05

techpyaasa

2 Answers

As each depends on streaming data over the network, it largely depends how distributed your cluster is, and where your data currently is.

If you have a single-DC cluster and latency is minimal between all nodes, then bringing up a new node with auto_bootstrap: true should be fine for you. Also, if at least one copy of your data has been replicated to your local datacenter (the one you are joining the new node to) then this is also the preferred method.

On the other hand, for multiple DCs I have found more success with setting auto_bootstrap: false and using nodetool rebuild. The reason for this, is because nodetool rebuild allows you to specify a data center as the source of the data. This path gives you the control to contain streaming to a specific DC (and more importantly, not to other DCs). And similar to the above, if you are building a new data center and your data has not yet been fully-replicated to it, then you will need to use nodetool rebuild to stream data from a different DC.

how read requests would be handled ?

In both of these scenarios, the token ranges would be computed for your new node when it joins the cluster, regardless of whether or not the data is actually there. So if a read request were to be sent to the new node at CL ONE, it should be routed to a node containing a secondary replica (assuming RF>1). If you queried at CL QUORUM (with RF=3) it should find the other two. That is of course, assuming that the nodes which are picking-up the slack are not overcome by their streaming activities that they cannot also serve requests. This is a big reason why the "2 minute rule" exists.

The bottom line, is that your queries do have a higher chance of failing before the new node is fully-streamed. Your chances of query success increase with the size of your cluster (more nodes = more scalability, and each bears that much less responsibility for streaming). Basically, if you are going from 3 nodes to 4 nodes, you might get failures. If you are going from 30 nodes to 31, your app probably won't notice a thing.

Will the new node try to pull data from nodes in the other data centers too?

Only if your query isn't using a LOCAL consistency level.

answered Oct 21 '22 14:10

Aaron

I'm not sure this was ever answered:

If I go this way , where read requests go soon starting this new nodes , as at this point it has only token range assigned to them(new nodes) but NO data got streamed to this nodes , will it cause Read request failures/CL issues/any other issue?

And the answer is yes. The new node will join the cluster, receive the token assignments, but since auto_bootstrap: false, the node will not receive any streamed data. Thus, it will be a member of the cluster, but will not have any old data. New writes will be received and processed, but existing data prior to the node joining, will not be available on this node.

With that said, with the correct CL levels, your new node will still do background and foreground read repair, so that it shouldn't respond any differently to requests. However, I would not go this route. With 2 DC's, I would divert traffic to DCA, add all of the nodes with auto_bootstrap: false to DCB, and then rebuild the nodes from DCA. The rebuild will need to be from DCA because the tokens have changed in DCB, and with auto_bootstrap: false, the data may no longer exist. You could also run repair, and that should resolve any discrepancies as well. Lastly, after all of the nodes have been bootstrapped, run cleanup on all of the nodes in DCB.

answered Oct 21 '22 14:10

stevenlacerda

Related questions
                            
                                Cassandra 3.10 debug.log contains frequent "FailureDetector.java:457 - Ignoring interval time of..."
                            
                                Apache Cassandra and Windows
                            
                                Cassandra table with multiple counter columns
                            
                                Cassandra Powershell Issue
                            
                                Cassandra update value in one of the clustering column
                            
                                Dynamically adding new nodes in Cassandra
                            
                                Cassandra Error message: Not marking nodes down due to local pause. Why?
                            
                                Is there a reason that Cassandra doesn't have Geospatial support?
                            
                                Clarifications about nodetool repair -pr
                            
                                Cassandra Allow filtering
                            
                                Mapping Cassandra Super Columns
                            
                                Paging Resultsets in Cassandra with compound primary keys - Missing out on rows
                            
                                Combine results from batch RDD with streaming RDD in Apache Spark
                            
                                Cassandra IN query not working if table has SET type column
                            
                                Streaming data from Kafka into Cassandra in real time
                            
                                modelling cassandra tables for upsert and select query
                            
                                Database that consumes less disk space
                            
                                How should I copy a keyspace within a cluster
                            
                                Is TTL for Cassandra counter column family supported?
                            
                                Pandas and Cassandra: numpy array format incompatibility

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With