The documentation for nodetool compact says:
This command starts the compaction process on tables that use the SizeTieredCompactionStrategy and DateTieredCompactionStrategy. You can specify a keyspace for compaction.
But what does it do for DateTieredCompactionStrategy?
Side question: what is the -s, --split-output parameter? It is explained as: "Use -s to not create a single big file". I'm confused - isn't that the purpose of nodetool compact?
Forces a major compaction on one or more tables.
Drains the node. Flushes all memtables from the node to SSTables on disk. DSE stops listening for connections from the client and other nodes.
Cassandra compaction is the process of reconciling the various copies of data spread across distinct SSTables. Cassandra compacts SSTables as a background activity. Because of compaction, Cassandra has to maintain fewer SSTables and fewer copies of each data row, which improves read performance.
Repairs one or more tables. The repair command repairs one or more nodes in a cluster, and provides options for restricting repair to a set of nodes, see Repairing nodes. Performing an anti-entropy node repair on a regular basis is important, especially in an environment that deletes data frequently.
Nodetool compact with no flags will still create a big single file even with DTCS.
The -s, --split-output option is only available in Cassandra 2.2 and later.
NEWS.txt states:
+ It is also possible to split output when doing a major compaction with
+ STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size.
+ This might be a bit better than old major compactions which created one big
+ file on disk.
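To make the 50%, 25%, 12.5% pattern concrete, here is a minimal Python sketch of a halving split. It is not Cassandra's code: the function name `split_sizes` and the `smallest_bytes` cutoff are hypothetical, chosen only to illustrate how repeatedly halving the remaining data yields files of decreasing size plus one final file holding the remainder.

```python
def split_sizes(total_bytes, smallest_bytes):
    """Return output file sizes for a halving split of total_bytes.

    Illustrative only: smallest_bytes is an assumed cutoff below which
    we stop splitting; it is not taken from the Cassandra source.
    """
    sizes = []
    remaining = total_bytes
    # Emit half of what is left while that half is still large enough.
    while remaining // 2 >= smallest_bytes:
        half = remaining // 2
        sizes.append(half)
        remaining -= half
    # Whatever is left goes into one final file.
    sizes.append(remaining)
    return sizes


if __name__ == "__main__":
    # 1600 units of data with a cutoff of 100 units
    print(split_sizes(1600, 100))  # [800, 400, 200, 100, 100]
```

The sizes always sum to the original total, so nothing is lost; the data is just spread over several files rather than one.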
On DTCS, -s won't do anything special (it will still create one large SSTable).