I have a small cluster that is virtually empty. Usually, nodetool removenode completes in tens of seconds. However, I currently have a node removal in progress that has been running for tens of minutes and does not seem to be making any progress. An additional request to remove the node is rejected because a removal is already in progress. How can I troubleshoot this? For reference, here is the output of nodetool status:
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns   Host ID                               Rack
DL  192.168.12.207  152.14 KB  256     32.2%  683d8351-c625-4d7f-99cc-61f6b73b0c56  rack1
UN  192.168.12.205  215.21 KB  256     37.2%  b66d5fff-ef1d-4fbf-a49a-43709df99a0c  rack1
UN  192.168.12.208  148.09 KB  256     30.6%  39b54771-59b8-49f7-8db8-9cf4523d6c8d  rack1
Also, Cassandra is not running on host 207 (the leaving host), but it is running on the other two hosts.
EDIT: It seems there is at least one token that is stuck awaiting replication:
$ nodetool removenode status
RemovalStatus: Removing token (-9037887679483580088). Waiting for replication confirmation from [/192.168.12.205].
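If you want to poll this status from a script, the line can be parsed to see which replicas the removal is still waiting on. The following is only a rough sketch: the function name pending_replicas is made up for illustration, and the regular expression assumes the message keeps the exact wording shown above, which may differ between Cassandra versions.

```python
import re
from typing import List


def pending_replicas(status_line: str) -> List[str]:
    """Return the addresses a removal is still waiting on (hypothetical helper).

    Assumes the wording 'Waiting for replication confirmation from [...]'
    exactly as it appears in the output quoted above.
    """
    match = re.search(
        r"Waiting for replication confirmation from \[([^\]]*)\]", status_line
    )
    if not match:
        return []
    # Entries look like '/192.168.12.205'; drop the leading slash.
    return [a.strip().lstrip("/") for a in match.group(1).split(",") if a.strip()]


line = (
    "RemovalStatus: Removing token (-9037887679483580088). "
    "Waiting for replication confirmation from [/192.168.12.205]."
)
print(pending_replicas(line))  # ['192.168.12.205']
```

An empty list means the line did not match the expected wording; it does not by itself mean the removal has finished.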
I don't know which version of Cassandra has this problem, but if nodetool removenode is not working, the Apache Cassandra Wiki suggests the following:
Removenode
Removing a node that does not physically exist anymore is done in two steps:
bin/nodetool removenode <UUID>
bin/nodetool removenode force
The first command will block forever if the computer attached to that UUID was physically removed (or does not run Cassandra anymore). Just press Ctrl-C after a second or two before running the second command. Obviously, it is better to decommission a node first if possible; otherwise you may lose some of your data.
The "bin/nodetool status" command shows the UUID of your nodes.
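To pick out the UUID without reading the table by eye, the status output can be parsed. This is a minimal sketch, not an official tool: the function name down_nodes is hypothetical, the sample text is copied from the question, and the pattern assumes the two-letter status code and UUID layout shown there. It only prints the commands quoted above; it does not run them.

```python
import re
from typing import List, Tuple

# Two-letter status/state code, address, then the first UUID on the line.
ROW = re.compile(
    r"^([UD])([NLJM])\s+(\S+)\s.*?\b([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\b"
)


def down_nodes(status_output: str) -> List[Tuple[str, str]]:
    """Return (address, host_id) for every node reported as Down (hypothetical helper)."""
    found = []
    for raw in status_output.splitlines():
        m = ROW.match(raw.strip())
        if m and m.group(1) == "D":
            found.append((m.group(3), m.group(4)))
    return found


sample = """\
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
DL 192.168.12.207 152.14 KB 256 32.2% 683d8351-c625-4d7f-99cc-61f6b73b0c56 rack1
UN 192.168.12.205 215.21 KB 256 37.2% b66d5fff-ef1d-4fbf-a49a-43709df99a0c rack1
UN 192.168.12.208 148.09 KB 256 30.6% 39b54771-59b8-49f7-8db8-9cf4523d6c8d rack1
"""

for address, host_id in down_nodes(sample):
    # Print the two steps from the wiki for review; nothing is executed here.
    print(f"# node {address} is down")
    print(f"bin/nodetool removenode {host_id}")
    print("bin/nodetool removenode force")
```

Printing rather than executing is deliberate, since forcing a removal can lose data and should be a conscious manual step.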
Hope it helps.