Cassandra nodetool has a command called cleanup:
cleanup [keyspace][cf_name]
Triggers the immediate cleanup of keys no longer belonging to this node. This has roughly the same effect on a node that a major compaction does in terms of a temporary increase in disk space usage and an increase in disk I/O. Optionally takes a list of column family names.
My questions are:
When will a node have keys that do not belong to it?
When you have added new nodes to the cluster, decreased the replication factor, or moved tokens.
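To make the first case concrete, here is a minimal sketch of a toy token ring in Python. It is not Cassandra's implementation: it assumes a replication factor of 1, a single token per node (no vnodes), and a tiny token space of 0..99. The names `owner`, `stale_keys`, and the node names and tokens are hypothetical, chosen only for illustration. It shows that after a node joins, the old node still holds data for a range it no longer owns, which is the data cleanup removes.

```python
from bisect import bisect_left


def owner(key_token, ring):
    """Return the node that owns key_token.

    ring maps node name -> token. In this toy model a node owns the
    range (previous node's token, its own token], wrapping around.
    """
    ordered = sorted((token, name) for name, token in ring.items())
    positions = [token for token, _ in ordered]
    # First node whose token is >= key_token; wrap to the first node if none.
    idx = bisect_left(positions, key_token) % len(ordered)
    return ordered[idx][1]


def stale_keys(node, tokens_on_disk, ring):
    """Tokens stored on `node` that the ring no longer assigns to it."""
    return sorted(t for t in tokens_on_disk if owner(t, ring) != node)


# Hypothetical two-node ring: B owns tokens 1..50, A owns the rest.
old_ring = {"A": 0, "B": 50}
on_disk_b = [5, 20, 30, 45]  # tokens of rows B has written to disk

# A new node C joins at token 25 and takes over the range 1..25 from B.
new_ring = {"A": 0, "B": 50, "C": 25}

print(stale_keys("B", on_disk_b, old_ring))
print(stale_keys("B", on_disk_b, new_ring))
```

In this sketch, B holds nothing stale before C joins. Afterwards, the rows with tokens 5 and 20 belong to C but remain on B's disk until they are cleaned up.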
When should I issue a cleanup?
After one of the above operations, if you need to save disk space. There is no harm in delaying it: cleanup has a performance impact, and the only reason to run it is to save disk space.
Should I do cleanup regularly (e.g. once per week)?
No, only if you need to save space after one of the above operations.