One of the nodes in a cassandra cluster has died.
I'm using cassandra 2.0.7 throughout.
When I do a nodetool status this is what I see (real addresses have been replaced with fake 10 nets)
[root@beta-new:/opt] #nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.10.1.94 171.02 KB 256 49.4% fd2f76ae-8dcf-4e93-a37f-bf1e9088696e rack1
DN 10.10.1.98 ? 256 50.6% f2a48fc7-a362-43f5-9061-4bb3739fdeaf rack1
I tried to get the token ID for the down node by doing a nodetool ring command, grepping for the IP and doing a head -1 to get the initial one.
[root@beta-new:/opt] #nodetool ring | grep 10.10.1.98 | head -1
10.10.1.98 rack1 Down Normal ? 50.59% -9042969066862165996
I then started following this documentation on how to replace the node:
[http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?scroll=task_ds_aks_15q_gk][1]
So I installed cassandra on a new node but did not start it.
Set the following options:
cluster_name: 'Jokefire Cluster'
seed_provider:
- seeds: "10.10.1.94"
listen_address: 10.10.1.94
endpoint_snitch: SimpleSnitch
And set the initial token of the new install as the token -1 of the node I'm trying to replace in cssandra.yaml:
initial_token: -9042969066862165995
And after making sure there was no data yet in: /var/lib/cassandra
I started up the database:
[root@web2:/etc/alternatives/cassandrahome] #./bin/cassandra -f -Dcassandra.replace_address=10.10.1.98
The documentation I link to above says to use the replace_address directive on the command line rather than cassandra-env.sh if you have a tarball install (which we do) as opposed to a package install.
After I start it up, cassandra fails with the following message:
Exception encountered during startup: Cannot replace_address /10.10.10.98 because it doesn't exist in gossip
So I'm wondering at this point if I've missed any steps or if there is anything else I can try to replace this dead cassandra node?
In the cassandra. yaml file for each node, remove the IP address of the dead node from the - seeds list in the seed-provider property. If the cluster needs a new seed node to replace the dead node, add the new node's IP address to the - seeds list of the other nodes.
Cassandra provides a command-line utility for both scenarios. If a node is down and we want to remove it from the cluster, then we should run nodetool removenode <Dead-Node-ID> from some other node. This command streams the data for which this node was responsible to remaining nodes from the remaining replicas.
Cassandra uses a protocol called gossip to discover location and state information about the other nodes participating in a Cassandra cluster. Gossip is a peer-to-peer communication protocol in which nodes periodically exchange state information about themselves and about other nodes they know about.
When a node comes back online after an outage, it may have missed writes for the replica data it maintains. Repair mechanisms exist to recover missed data, such as hinted handoffs and manual repair with nodetool repair. The length of the outage will determine which repair mechanism is used to make the data consistent.
Has the rest of your cluster been restarted since the node failure, by chance? Most gossip information does not survive a full restart, so you may genuinely not have gossip information for the down node.
This issue was reported as a bug CASSANDRA-8138, and the answer was:
I think I'd much rather say that the edge case of a node dying, and then a full cluster restart (rolling would still work) is just not supported, rather than make such invasive changes to support replacement under such strange and rare conditions. If that happens, it's time to assassinate the node and bootstrap another one.
So rather than replacing your node, you need to remove the failed node from the cluster and start up a new one. If using vnodes, it's quite straightforward.
Discover the node ID of the failed node (from another node in the cluster)
nodetool status | grep DN
And remove it from the cluster:
nodetool removenode (node ID)
Now you can clear out the data directory of the failed node, and bootstrap it as a brand-new one.
Some less known issues of Cassandra dead node replacement has been captured in below link based on my experience:
https://github.com/laxmikant99/cassandra-single-node-disater-recovery-lessons
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With