I often see the following message when running nodetool repair:
[2015-02-10 16:19:40,042] Lost notification. You should check server log for repair status of keyspace xxx
What does it really mean, and how can I prevent it if it's dangerous?
I'm using Cassandra 2.1.2 in a four-node cluster.
This message is not harmful by itself. It only means that nodetool lost track of the repair status; the repair itself is not affected. It can become dangerous if you script repairs so that the next repair command is issued as soon as the previous nodetool call returns: after a lost notification the earlier repair may still be running, so you end up with multiple concurrent repairs, which put a much higher load on the system. I used to have a script (I no longer have it) that, when it saw the "Lost notification" message, monitored the logs for the repair start/finish messages so that it would not launch competing repairs.
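I can't reproduce that script, but the idea can be sketched roughly as follows. This is only a minimal sketch: the function name repairs_in_flight is made up for illustration, and the start/finish patterns are deliberately left as arguments, because the exact wording of the repair log lines depends on your Cassandra version. Check your own system.log for the right patterns first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra or nodetool): print how many
# repairs have started but not yet finished, judging only by a log file.
#   $1  path to the log file
#   $2  grep pattern matching a "repair started" line  (assumption: you supply it)
#   $3  grep pattern matching a "repair finished" line (assumption: you supply it)
repairs_in_flight() {
    log_file=$1
    start_pattern=$2
    finish_pattern=$3

    # Refuse to guess when the log cannot be read.
    [ -r "$log_file" ] || return 2

    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps this usable under "set -e".
    started=$(grep -c -e "$start_pattern" "$log_file" || true)
    finished=$(grep -c -e "$finish_pattern" "$log_file" || true)

    echo $((started - finished))
}

# Possible use in a repair script (patterns are placeholders):
#   while [ "$(repairs_in_flight /var/log/cassandra/system.log "$START" "$FINISH")" -gt 0 ]
#   do
#       sleep 60
#   done
```

Note that this only counts lines in the current log file, so it will be wrong if the log is rotated while a repair is running.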
This seems to be a known bug which has already been fixed in the latest releases.
As the error message suggests, you can always check Cassandra's system log and collect information about the repair activity.
$ cd /var/log/cassandra/
$ grep repair system.log
Please note that I am currently testing Cassandra 2.1.15 and still encountered the problem. Since it is not a major bug and does not really affect the repair process, I think it will stick around for some time.