
How do I remove a dead node from a Cassandra cluster?

  1. I have a Cassandra cluster of 12 nodes on EC2.
  2. Because of a failure we lost one of the nodes completely; that machine does not exist anymore.
  3. So I created a new EC2 instance with a different IP and the same token as the dead node. I also had a backup of the data on that node, so it works fine.
  4. The problem is that the dead node's IP still appears as an unreachable node in describe cluster.
  5. As that node (EC2 instance) does not exist anymore, I cannot use nodetool decommission or nodetool disablegossip.

How can I get rid of this unreachable node?

samarth asked Dec 21 '11 12:12


1 Answer

I had the same problem and I resolved it with removenode, which does not require you to find and change the node token.

First, get the node UUID:

nodetool status

DN  192.168.56.201  ?          256     13.1%  4fa4d101-d8d2-4de6-9ad7-a487e165c4ac  r1
DN  192.168.56.202  ?          256     12.6%  e11d219a-0b65-461e-babc-6485343568f8  r1
UN  192.168.2.91    156.04 KB  256     12.4%  e1a33ed4-d613-47a6-8b3b-325650a2bbd4  RAC1
UN  192.168.2.92    156.22 KB  256     13.6%  3a4a086c-36a6-4d69-8b61-864ff37d03c9  RAC1
UN  192.168.2.93    149.6 KB   256     11.3%  20decc72-8d0a-4c3b-8804-cc8bc98fa9e8  RAC1
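With more than a couple of dead nodes, copying the UUIDs by hand gets error-prone. Here is a minimal sketch that pulls the host IDs of the DN rows out of the status output. The function name down_host_ids is made up for illustration, and it assumes the column layout shown above (host ID second to last, rack last), which may differ in other Cassandra versions.

```shell
# Print the host ID (UUID) of every node that nodetool status reports as DN.
# Assumption: the host ID is the second-to-last column and the rack is the
# last one, as in the output above. Counting from the end sidesteps the Load
# column, which is one field ("?") or two ("156.04 KB").
down_host_ids() {
    awk '$1 == "DN" { print $(NF - 1) }'
}

# Usage (needs a reachable Cassandra node):
#   nodetool status | down_host_ids
```

Header lines and UN rows are skipped because their first field is not DN.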

As you can see, .201 and .202 are dead and on a different network. Those nodes were re-addressed to .91 and .92 without being properly decommissioned and recommissioned. I was setting up the network and made a few mistakes...

Second, remove .201 with the following command:

nodetool removenode 4fa4d101-d8d2-4de6-9ad7-a487e165c4ac

(in older versions it was nodetool remove ...)

However, just like nodetool removetoken ..., this command blocks (see the comment by samarth on psanford's answer). It does have a useful side effect: it puts that UUID in a list of nodes to be removed. So next we can force the removal with:

nodetool removenode force

(in older versions it was nodetool remove ...)

Now the node accepts the command, and it tells me that it is removing the invalid entry:

RemovalStatus: Removing token (-9136982325337481102). Waiting for replication confirmation from [/192.168.2.91,/192.168.2.92].

We can also see that it communicates with the two other nodes that are up, so it takes a little time, but it is still quite fast.
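If the removal seems stuck, it helps to know which nodes it is still waiting on. The sketch below extracts those addresses from a RemovalStatus line. The function name pending_replicas is hypothetical, and the parsing assumes the exact message format shown above, which I have not checked against other Cassandra versions.

```shell
# Print one address per line for the nodes a removal is still waiting on.
# Assumption: the status line contains
#   "... confirmation from [/ip1,/ip2]."
# exactly as in the message above. Lines without that text print nothing.
pending_replicas() {
    sed -n 's/.*confirmation from \[\(.*\)\].*/\1/p' | tr ',' '\n' | sed 's|^/||'
}

# Usage (needs a reachable Cassandra node):
#   nodetool removenode status | pending_replicas
```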

Next, nodetool status no longer shows the .201 node. I repeated the process with .202, and now the status is clean.

After that, you may also want to run a cleanup as mentioned in psanford's answer:

nodetool cleanup

The cleanup should be run on all nodes, one by one, to make sure the change is fully taken into account.
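Running the cleanup one node at a time can be scripted. This is only a sketch: the function name cleanup_all_nodes is made up, and it assumes that the machine you run it from can reach each node's JMX port through nodetool -h. If JMX is bound to localhost only, you would run nodetool cleanup over SSH on each node instead.

```shell
# Run cleanup on each given host in turn, stopping at the first failure.
# Assumption: nodetool on this machine can reach every host remotely via -h.
cleanup_all_nodes() {
    for host in "$@"; do
        echo "cleanup on $host"
        # Each call blocks until that node finishes, so nodes never overlap.
        nodetool -h "$host" cleanup || return 1
    done
}

# Usage, with the live nodes from the status output above:
#   cleanup_all_nodes 192.168.2.91 192.168.2.92 192.168.2.93
```

Stopping at the first failure is a deliberate choice here, so that a node that is down or unreachable is noticed rather than silently skipped.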

Alexis Wilke answered Sep 27 '22 15:09