Cassandra file structure - how are the files used?

When experimenting with Cassandra I've observed that Cassandra writes to the following files:

/.../cassandra/commitlog/CommitLog-<id>.log
/.../cassandra/data/Keyspace1/Standard1-1-Data.db
/.../cassandra/data/Keyspace1/Standard1-1-Filter.db
/.../cassandra/data/Keyspace1/Standard1-1-Index.db
/.../cassandra/data/system/LocationInfo-1-Data.db
/.../cassandra/data/system/LocationInfo-1-Filter.db
/.../cassandra/data/system/LocationInfo-1-Index.db
/.../cassandra/data/system/LocationInfo-2-Data.db
/.../cassandra/data/system/LocationInfo-2-Filter.db
/.../cassandra/data/system/LocationInfo-2-Index.db
/.../cassandra/data/system/LocationInfo-3-Data.db
/.../cassandra/data/system/LocationInfo-3-Filter.db
/.../cassandra/data/system/LocationInfo-3-Index.db
/.../cassandra/system.log

The general structure seems to be:

/.../cassandra/commitlog/CommitLog-ID.log
/.../cassandra/data/KEYSPACE/COLUMN_FAMILY-N-Data.db
/.../cassandra/data/KEYSPACE/COLUMN_FAMILY-N-Filter.db
/.../cassandra/data/KEYSPACE/COLUMN_FAMILY-N-Index.db
/.../cassandra/system.log

What is the Cassandra file structure? More specifically, how are the data and commitlog directories used, and what is the structure of the files in the data directory (Data/Filter/Index)?

asked Mar 01 '10 21:03

knorv

2 Answers

A write to a Cassandra node first hits the CommitLog (a sequential append). Cassandra then stores the values in column-family-specific, in-memory data structures called Memtables. A Memtable is flushed to disk whenever one of the configurable thresholds is exceeded: (1) the size of the data in the Memtable, (2) the number of objects in the Memtable, or (3) the lifetime of the Memtable.
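The write path described above can be sketched as a toy model. This is only an illustration, not Cassandra's implementation: the class `TinyNode`, the file names `CommitLog-toy.log` and `Toy-N-Data.db`, the tab-separated line format, and the `max_entries` threshold are all assumptions made for the example, and only the object-count threshold is modelled.

```python
import os


class TinyNode:
    """Toy write path: append to a commit log, update a memtable, flush when full."""

    def __init__(self, directory, max_entries=3):
        self.directory = directory
        self.max_entries = max_entries  # hypothetical object-count threshold
        self.memtable = {}              # in-memory key -> value map
        self.generation = 0             # counter used in flushed file names
        self.commitlog_path = os.path.join(directory, "CommitLog-toy.log")

    def write(self, key, value):
        # 1. Sequential append to the commit log.
        with open(self.commitlog_path, "a", encoding="utf-8") as log:
            log.write(f"{key}\t{value}\n")
        # 2. Store the value in the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable reaches the threshold.
        if len(self.memtable) >= self.max_entries:
            return self.flush()
        return None

    def flush(self):
        # Write the memtable to a new data file, sorted by key, then clear it.
        self.generation += 1
        path = os.path.join(self.directory, f"Toy-{self.generation}-Data.db")
        with open(path, "w", encoding="utf-8") as data:
            for key in sorted(self.memtable):
                data.write(f"{key}\t{self.memtable[key]}\n")
        self.memtable.clear()
        return path
```

Calling `write` repeatedly on a `TinyNode` pointed at an empty directory produces one growing commit log and a new numbered data file each time the threshold is reached, which mirrors the numbered files seen in the question's listing.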

The data folder contains a subfolder for each keyspace. Each subfolder contains three kinds of files:

Data file: an SSTable (nomenclature borrowed from Google; it stands for Sorted Strings Table), which is a file of key-value string pairs sorted by key.
Index file: (key, offset) pairs that point into the data file.
Bloom filter: a filter built over all keys in the data file.
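How the three files can cooperate on a read is sketched below. It is a simplified in-memory model under stated assumptions, not Cassandra's on-disk format: the names `BloomFilter`, `build_sstable`, and `lookup`, the tab/newline record layout, and the filter parameters are hypothetical, and keys and values are assumed not to contain tabs or newlines.

```python
import bisect
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive several bit positions per key from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def build_sstable(rows):
    """Return (data, index, bloom) for a dict of string keys to string values."""
    data = bytearray()
    index = []            # (key, offset) pairs, sorted by key
    bloom = BloomFilter()
    for key in sorted(rows):
        index.append((key, len(data)))  # offset of this record in the data
        data += f"{key}\t{rows[key]}\n".encode("utf-8")
        bloom.add(key)
    return bytes(data), index, bloom


def lookup(key, data, index, bloom):
    """Return the value stored for key, or None if it is not present."""
    if not bloom.might_contain(key):
        return None                     # definitely absent: skip index and data
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)   # binary search over the sorted keys
    if i == len(keys) or keys[i] != key:
        return None                     # the Bloom filter gave a false positive
    offset = index[i][1]
    end = data.index(b"\n", offset)
    _, value = data[offset:end].decode("utf-8").split("\t", 1)
    return value
```

The point of the ordering is cost: the Bloom filter check is cheap and can rule a file out entirely, the index narrows a hit to one offset, and only then is the data itself read.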

answered Oct 23 '22 09:10

Schildmeijer

Cassandra File Format in detail

Each ColumnFamily (e.g. one kind of object) is stored in its own separate set of SSTable files:

ColumnFamilyName-version-#-Data.db
ColumnFamilyName-version-#-Index.db
ColumnFamilyName-version-#-Filter.db
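As a small illustration of the naming pattern, the sketch below splits such a file name into its parts. The function `parse_sstable_name` and its regular expression are assumptions written for this example, not part of Cassandra; the pattern accepts both the `ColumnFamilyName-version-#-Component.db` form above and the shorter `COLUMN_FAMILY-N-Component.db` form from the question, and the version string `hc` in the usage is only an illustrative value.

```python
import re

# Hypothetical pattern: column family, optional version, generation number, component.
SSTABLE_NAME = re.compile(
    r"(?P<column_family>.+?)-"
    r"(?:(?P<version>[^-]+)-)?"
    r"(?P<generation>\d+)-"
    r"(?P<component>Data|Index|Filter)\.db"
)


def parse_sstable_name(filename):
    """Return the parts of an SSTable file name as a dict, or None if it does not match."""
    match = SSTABLE_NAME.fullmatch(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["generation"] = int(parts["generation"])
    return parts


if __name__ == "__main__":
    print(parse_sstable_name("Standard1-1-Data.db"))
    print(parse_sstable_name("Users-hc-5-Index.db"))
```

Grouping files by column family and generation with a helper like this shows which Data, Index, and Filter files belong to the same SSTable.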

answered Oct 23 '22 11:10

leef
