I cannot find documentation on the "compactionstats":
When using nodetool compactionstats, what do the numerical values in the completed and total columns mean?
My column family has a total data size of about 360 GB but my compaction status displays:
pending tasks: 7
compaction type   keyspace   column family   completed      total           unit    progress
Compaction        Test       Message         161257707087   2475323941809   bytes   6.51%
I can see "completed" increasing slowly (and the progress too ;-)).
But how is "total" computed? Why is it 2.5 TB when I have only 360 GB of data?
You must have compression on. total is the total number of uncompressed bytes in the set of sstables that are being compacted together. If you grep the Cassandra log file for lines containing Compacting, you will find the sstables that are part of a compaction. If you sum these sizes and multiply by the inverse of your compression ratio for the column family, you will get pretty close to the total. By default this can be a bit difficult to verify on a multi-core system, because the number of simultaneous compactions defaults to the number of cores.
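The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 360 GB figure is taken as 360 * 10**9 bytes under the assumption that the running compaction covers essentially all of the column family's sstables, which the question does not confirm.

```python
def estimate_uncompressed_total(sstable_sizes_on_disk, compression_ratio):
    """Estimate the 'total' column from on-disk sstable sizes.

    compression_ratio is compressed size / uncompressed size (between 0 and 1),
    so dividing by it is the "multiply by the inverse" step.
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return sum(sstable_sizes_on_disk) / compression_ratio


def implied_compression_ratio(on_disk_bytes, reported_total):
    """Work backwards: which ratio would explain the reported total?"""
    return on_disk_bytes / reported_total


# Numbers from the question; the on-disk figure is an assumption (see above).
on_disk = 360 * 10**9
reported_total = 2475323941809
ratio = implied_compression_ratio(on_disk, reported_total)
print(f"implied compression ratio: {ratio:.3f}")
```

If the implied ratio is close to the compression ratio reported for the column family, the 2.5 TB total is consistent with 360 GB of compressed data.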
You can also verify this answer by looking at the code: AbstractCompactionIterable.getCompactionInfo() uses the bytesRead and totalBytes fields from that class. totalBytes is final and is computed in the constructor by summing getLengthInBytes() from each file that is part of the compaction.
The scanners vary, but the length in bytes returned by CompressedRandomAccessReader is the uncompressed size of the file.
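As a rough model of that logic (not the actual Cassandra source, which is Java), the bookkeeping can be sketched in Python. The Scanner and CompactionProgress classes below are hypothetical stand-ins; only the idea of summing each file's uncompressed length once in the constructor and comparing it with the bytes read so far comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Scanner:
    """Hypothetical stand-in for an sstable scanner."""
    length_in_bytes: int  # uncompressed length of the file


class CompactionProgress:
    """Toy model of the completed/total bookkeeping."""

    def __init__(self, scanners):
        # Computed once, up front, by summing the length of every input file.
        self.total_bytes = sum(s.length_in_bytes for s in scanners)
        self.bytes_read = 0

    def progress_percent(self):
        if self.total_bytes == 0:
            return 0.0
        return 100.0 * self.bytes_read / self.total_bytes


# Plugging in the completed and total values from the question.
compaction = CompactionProgress([Scanner(2475323941809)])
compaction.bytes_read = 161257707087
print(f"{compaction.progress_percent():.2f}%")
```

With the values from the question, this reproduces the progress column of the nodetool output, which suggests progress is simply completed divided by total.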