
Apache Cassandra Data Storage on Disk

Tags:

cassandra

Is Cassandra's data stored only in the /var/lib/cassandra folder as mentioned in the cassandra.yaml file?

Or is there any other location where Cassandra data is stored?

Mahesh Gupta asked Mar 09 '12 09:03


People also ask

How is Cassandra data stored in disk?

When a write occurs, Cassandra stores the data in a memory structure called memtable, and to provide configurable durability, it also appends writes to the commit log on disk. The commit log receives every write made to a Cassandra node, and these durable writes survive permanently even if power fails on a node.

Is Cassandra a memory or disk?

A fundamental limitation of Cassandra is that it is disk-based, not an in-memory database. This means that read performance is always capped by I/O specifications, ultimately restricting application performance and limiting the ability to attain an acceptable user experience.

How much data can you store in Cassandra?

Maximum recommended capacity for Cassandra 1.2 and later is 3 to 5TB per node for uncompressed data. For Cassandra 1.1, it is 500 to 800GB per node.

Can Cassandra store files?

Cassandra was never designed to manage file or object storage metadata and it is predictably weak in this regard. It is not ACID compliant. It does not have the rigidity to prevent partially successful writes, dupes, contradictions and the like.


1 Answer

Cassandra stores its data wherever cassandra.yaml tells it to, so you can change the storage location there if you don't want data stored under /var/lib/cassandra. See DataStax's Guide for Configuring Cassandra for a full explanation of the config file. In particular, two settings control where data lands on disk:

> commitlog_directory

The directory where the commit log will be stored. For optimal write performance, DataStax recommends the commit log be on a separate disk partition (ideally a separate physical device) from the data file directories.

> data_file_directories

The directory location where column family data (SSTables) will be stored.

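They do recommend you put the commit log on one disk and the actual data on a second disk. As the description of commitlog_directory says, the reason is write performance: commit log appends then don't compete with SSTable reads and writes for the same device.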
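As a rough sketch of what that looks like in cassandra.yaml, the excerpt below puts the SSTables and the commit log on different mounts. The /mnt/disk1 and /mnt/disk2 paths are made-up examples, not defaults, so substitute your own. saved_caches_directory is not covered in the quote above; it is a third path setting in the same file, included here because it is another place Cassandra writes to.

```yaml
# cassandra.yaml (excerpt)
# NOTE: the /mnt/... paths are hypothetical; use directories that exist on your nodes.

# SSTables (column family data). This is a list, so more than one directory can be given.
data_file_directories:
    - /mnt/disk1/cassandra/data

# Commit log, ideally on a separate physical device from the data directories.
commitlog_directory: /mnt/disk2/cassandra/commitlog

# Saved key/row caches (not mentioned in the quoted guide text above).
saved_caches_directory: /mnt/disk1/cassandra/saved_caches
```

The directories need to be writable by the user that runs Cassandra, and the node should be restarted after editing the file so the new paths are picked up.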

FloppyDisk answered Oct 05 '22 15:10