
nodetool removenode stuck during removal

Tags:

cassandra

I have a small cluster that is virtually empty. Usually nodetool removenode completes within tens of seconds. However, I currently have a node removal in progress that has been running for tens of minutes and doesn't seem to be making any progress. An additional request to remove the node is rejected because a removal is already in progress. How can I troubleshoot this? For reference, here is the output of nodetool status:

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns   Host ID                               Rack
DL  192.168.12.207  152.14 KB  256     32.2%  683d8351-c625-4d7f-99cc-61f6b73b0c56  rack1
UN  192.168.12.205  215.21 KB  256     37.2%  b66d5fff-ef1d-4fbf-a49a-43709df99a0c  rack1
UN  192.168.12.208  148.09 KB  256     30.6%  39b54771-59b8-49f7-8db8-9cf4523d6c8d  rack1

Also, Cassandra is not running on host 207 (the leaving host), but it is running on the other two hosts.

EDIT: It seems there is at least one token that is stuck awaiting replication:

$ nodetool removenode status
RemovalStatus: Removing token (-9037887679483580088). Waiting for replication confirmation from [/192.168.12.205].
jonderry asked Sep 19 '14 22:09


1 Answer

I don't know which version of Cassandra you are running, but if nodetool removenode is not working, the Apache Cassandra Wiki suggests trying the following:

Removenode

Removing a node that does not physically exist anymore is done in two steps:

  bin/nodetool removenode <UUID> 

  bin/nodetool removenode force

The first command will block forever if the machine with that UUID was physically removed (or no longer runs Cassandra). Press Ctrl-C after a second or two, then run the second command. Obviously, it is better to decommission a node first if possible; otherwise you may lose some of your data.

The "bin/nodetool status" command shows the UUID of your nodes.
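If you would rather not copy the UUID by hand, something like the following might help. It is only a sketch: the helper name down_host_ids is made up for illustration, and it assumes the status output looks like the one in the question (a two-letter status in the first column and the Host ID as the only UUID-shaped field on the line). The sample text is copied from the question so the snippet runs without a cluster.

```shell
#!/bin/sh
# down_host_ids (hypothetical helper): read `nodetool status` output on stdin
# and print the Host ID of every node whose status letter is D (down).
# Assumption: the layout matches the output shown in the question.
down_host_ids() {
    awk '$1 ~ /^D[NLJM]$/ {
        # Scan the fields for the UUID-shaped one instead of relying on a
        # fixed column, since the Load column may be one or two fields wide.
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+$/)
                print $i
    }'
}

# Sample input copied from the question.
sample_status='Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns   Host ID                               Rack
DL  192.168.12.207  152.14 KB  256     32.2%  683d8351-c625-4d7f-99cc-61f6b73b0c56  rack1
UN  192.168.12.205  215.21 KB  256     37.2%  b66d5fff-ef1d-4fbf-a49a-43709df99a0c  rack1
UN  192.168.12.208  148.09 KB  256     30.6%  39b54771-59b8-49f7-8db8-9cf4523d6c8d  rack1'

printf '%s\n' "$sample_status" | down_host_ids
```

On a live cluster you would pipe the real output in instead (bin/nodetool status | down_host_ids) and pass the printed ID to bin/nodetool removenode. If that removal hangs, bin/nodetool removenode status shows what it is waiting on before you fall back to bin/nodetool removenode force.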

Hope it helps.

juliccr answered Oct 07 '22 00:10