Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding Cassandra's storage overhead

I have been reading this section of the Cassandra docs and found the following a little puzzling:

Determine column overhead:

regular_total_column_size = column_name_size + column_value_size + 15

counter - expiring_total_column_size = column_name_size + column_value_size + 23

Every column in Cassandra incurs 15 bytes of overhead. Since each row in a table can have different column names as well as differing numbers of columns, metadata is stored for each column. For counter columns and expiring columns, you should add an additional 8 bytes (23 bytes total).

The way I interpret the above for a CQL3 defined schema such as:

CREATE TABLE mykeyspace.mytable(
  id text,
  report_id text,
  subset_id text,
  report_date timestamp,
  start_date timestamp,
  end_date timestamp,
  subset_descr text,
  x int,
  y double,
  z int,
  PRIMARY KEY (id, report_id, subset_id)
);

is that each row will contain the metadata for the column names, e.g., the strings report_date, start_date, end_date, etc. and their type along with the data. However, it's not clear to me what it means that each row in a table can have different column names. This sounds wrong to me given the schema above is totally static, i.e., Cassandra 2.0 will most certainly complain if I try to write:

INSERT INTO mykeyspace.mytable (id, report_id , subset_id, x, y, z, w) 
VALUES ( 'asd','qwe','rty',100,1.234,12, 123.123);

Bad Request: Unknown identifier w

Now it looks to me like column names are fixed given this table schema and thus the metadata should not need to be stored per each row. I am guessing either the phrasing in the documentation is outdated (it's the same as Cassandra 1.2) or I'm misunderstanding some core concept at work here.

Can anybody clarify? Bottom line: do I have to worry about the length of the names of my columns or not?

We have been playing it safe and used single character names where possible (so the above columns would actually be i, r, s, dr, ds, de, sd, ...), but it's so non human unreadable and can be confusing to work with.

like image 966
Giovanni Botta Avatar asked May 30 '14 15:05

Giovanni Botta


1 Answers

The easiest way to figure out what is going on in situations like this is to check the sstable2json (cassandra/bin) representation of your data. This will show you what ends up actually be saved on disk.

Here is the example for your situation

 [
 {"key": "4b6579","columns": [
       ["rid1:ssid1:","",1401469033325000],
       ["rid1:ssid1:end_date","2004-10-03 00:00:00-0700",1401469033325000],
       ["rid1:ssid1:report_date","2004-10-03 00:00:00-0700",1401469033325000],
       ["rid1:ssid1:start_date","2004-10-03 00:00:00-0700",1401469033325000], 
       ["rid1:ssid1:subset_descr","descr",1401469033325000],
       ["rid1:ssid1:x","1",1401469033325000], 
       ["rid1:ssid1:y","5.5",1401469033325000],
       ["rid1:ssid1:z","1",1401469033325000],
       ["rid2:ssid2:","",1401469938599000],
       ["rid2:ssid2:end_date", "2004-10-03 00:00:00-0700",1401469938599000],
       ["rid2:ssid2:report_date","2004-10-03 00:00:00-0700",1401469938599000],
       ["rid2:ssid2:start_date","2004-10-03 00:00:00-0700",1401469938599000], 
       ["rid2:ssid2:subset_descr","descr",1401469938599000],
       ["rid2:ssid2:x","1",1401469938599000],
       ["rid2:ssid2:y","5.5",1401469938599000],
       ["rid2:ssid2:z","1",1401469938599000]
 }
 ]

The value of the partition key is saved once per partition (per sstable) as you can see above, the column name in this case doesn't matter at all since it is implicit given the table. The column names for the clustering columns are also not present because with C* you aren't allowed to insert without specifying all portions of the key.

Whats left though does have the column name, this is needed incase a partial update to a row is made so it can be saved without the rest of the row information. You could imagine an update to a single column field in a row, to indicate which field this is C* currently uses the column name but there are tickets to change this to a smaller representation. https://issues.apache.org/jira/browse/CASSANDRA-4175

To generate this

cqlsh
CREATE TABLE mykeyspace.mytable(   id text,   report_id text,   subset_id text,   report_date timestamp,   start_date timestamp,   end_date timestamp,   subset_descr text,   x int,   y double,   z int,   PRIMARY KEY (id, report_id, subset_id) );
INSERT INTO mykeyspace.mytable (id, report_id , subset_id , report_date , start_date , end_date , subset_descr ,x, y, z) VALUES ( 'Key', 'rid1','ssid1', '2004-10-03','2004-10-03','2004-10-03','descr',1,5.5,1);
INSERT INTO mykeyspace.mytable (id, report_id , subset_id , report_date , start_date , end_date , subset_descr ,x, y, z) VALUES ( 'Key', 'rid2','ssid2', '2004-10-03','2004-10-03','2004-10-03','descr',1,5.5,1);
exit;
nodetool flush
bin/sstable2json $DATA_DIR/mytable/mykeyspace-mytable-jb-1-Data.db 
like image 159
RussS Avatar answered Oct 15 '22 14:10

RussS