Is Cassandra's data stored only in the /var/lib/cassandra folder, as mentioned in the cassandra.yaml file?
Or is there any other location where Cassandra data is stored?
When a write occurs, Cassandra stores the data in a memory structure called memtable, and to provide configurable durability, it also appends writes to the commit log on disk. The commit log receives every write made to a Cassandra node, and these durable writes survive permanently even if power fails on a node.
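The "configurable durability" mentioned above is controlled from cassandra.yaml by how often the commit log is flushed to disk. A minimal sketch, assuming the option names used in releases up to Cassandra 4.0 (newer releases may spell them differently, so check the cassandra.yaml shipped with your version):

```yaml
# cassandra.yaml (excerpt), illustrative values only
# periodic: writes are acknowledged right away and the commit log
# is synced to disk once per period
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
```

With periodic sync, a write acknowledged shortly before a power failure may not have reached the disk yet, so the window between syncs is the trade-off between throughput and durability.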
Limitations of Cassandra
A fundamental limitation of Cassandra is that it is disk-based, not an in-memory database. This means that read performance is always capped by I/O specifications, ultimately restricting application performance and limiting the ability to attain an acceptable user experience.
Maximum recommended capacity for Cassandra 1.2 and later is 3 to 5TB per node for uncompressed data. For Cassandra 1.1, it is 500 to 800GB per node.
Cassandra was never designed to manage file or object storage metadata and it is predictably weak in this regard. It is not ACID compliant. It does not have the rigidity to prevent partially successful writes, dupes, contradictions and the like.
You can change the data storage location in the cassandra.yaml file if you don't want data stored in /var/lib. See DataStax's Guide for Configuring Cassandra for a full explanation of the config file. In particular:
> commitlog_directory
The directory where the commit log will be stored. For optimal write performance, DataStax recommends the commit log be on a separate disk partition (ideally a separate physical device) from the data file directories.
> data_file_directories
The directory location where column family data (SSTables) will be stored.
They do recommend you put the commit log on one disk and the actual data on a second disk, mainly for better write performance.
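As a rough illustration of the two settings above, the relevant part of cassandra.yaml could look like the sketch below. The mount points /mnt/disk1 and /mnt/disk2 are hypothetical placeholders for two separate devices, not paths taken from any guide:

```yaml
# cassandra.yaml (excerpt); both paths are made-up examples
# SSTables go to the first disk (this option takes a list of directories)
data_file_directories:
    - /mnt/disk1/cassandra/data

# The commit log goes to a second disk
commitlog_directory: /mnt/disk2/cassandra/commitlog
```

Changes to cassandra.yaml take effect only after the node is restarted, and existing files are not moved for you, so copy them to the new locations while the node is stopped.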