
What does rows_merged mean in compactionhistory?

When I issue

$ nodetool compactionhistory

I get

. . . compacted_at        bytes_in       bytes_out      rows_merged
. . . 1404936947592       8096           7211           {1:3, 3:1}

What does {1:3, 3:1} mean? The only documentation I can find states

the number of partitions merged

which does not explain why there are multiple values or what the colon means.

Asked by Ztyx, Dec 19 '14 12:12


1 Answer

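Basically the map reads {sstables:rows}: each key is the number of input sstables a row (partition) was found in, and each value is how many rows fell into that bucket. For example, {1:3, 3:1} means 3 rows were each present in only one sstable (1:3) and 1 row was present in 3 sstables (3:1) and had to be merged, all to make the one sstable written by that compaction operation.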
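To make that reading concrete, here is a small Python sketch that parses a rows_merged string as printed by nodetool and spells out each entry. It is not part of nodetool; the function names parse_rows_merged and describe are made up for illustration.

```python
def parse_rows_merged(text):
    """Parse a string such as '{1:3, 3:1}' into {sstable_count: row_count}."""
    body = text.strip().strip("{}")
    result = {}
    for entry in body.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty map such as '{}'
        sstables, rows = entry.split(":")
        result[int(sstables)] = int(rows)
    return result


def describe(text):
    """Return one human-readable sentence per entry of the map."""
    return [
        f"{rows} row(s) were found in {sstables} input sstable(s)"
        for sstables, rows in sorted(parse_rows_merged(text).items())
    ]


for line in describe("{1:3, 3:1}"):
    print(line)
```

The keys are what matter when reading the history: a large share of rows under keys greater than 1 suggests the same rows were spread over several sstables before the compaction.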

I tried it out myself, so here's an example; I hope it helps.

Create the keyspace and table, then insert five rows:

cqlsh> create keyspace space1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

cqlsh> create TABLE space1.tb1 ( key text, val1 text, primary KEY (key));

cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key1','111');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key2','222');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key3','333');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key4','444');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key5','555');
cqlsh> exit

Now we flush to create the sstable

$ nodetool flush space1

We see that only one sstable (generation 1) has been created:

$ sudo ls -lR /var/lib/cassandra/data/space1

/var/lib/cassandra/data/space1:
total 4
drwxr-xr-x. 2 cassandra cassandra 4096 Feb  3 12:51 tb1

/var/lib/cassandra/data/space1/tb1:
total 32
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 12:51 space1-tb1-jb-1-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  146 Feb  3 12:51 space1-tb1-jb-1-Data.db
-rw-r--r--. 1 cassandra cassandra   24 Feb  3 12:51 space1-tb1-jb-1-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 12:51 space1-tb1-jb-1-Index.db
-rw-r--r--. 1 cassandra cassandra 4389 Feb  3 12:51 space1-tb1-jb-1-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 12:51 space1-tb1-jb-1-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 12:51 space1-tb1-jb-1-TOC.txt

Running sstable2json on it shows our data:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-1-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]
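The "key" values in this output are the partition keys printed as hex-encoded bytes. A quick sketch to decode them back to the text keys inserted above, assuming UTF-8 text keys as in this table:

```python
# Hex strings copied from the sstable2json output above.
hex_keys = ["6b657935", "6b657931", "6b657934", "6b657933", "6b657932"]

# Each pair of hex digits is one byte of the original text key.
decoded = [bytes.fromhex(h).decode("utf-8") for h in hex_keys]
print(decoded)  # ['key5', 'key1', 'key4', 'key3', 'key2']
```

The rows are not in insertion order because the sstable is sorted by partitioner token, not by key text.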

At this point 'nodetool compactionhistory' shows nothing for this table, but let's run compact anyway to see what we get (scroll right):

$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id                                       keyspace_name      columnfamily_name            compacted_at              bytes_in       bytes_out      rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b     space1             tb1                          1422968305305             146            146            {1:5}

Now let's delete two rows and flush:

cqlsh> delete from space1.tb1 where key='key1';
cqlsh> delete from space1.tb1 where key='key2';
cqlsh> exit

$ nodetool flush space1

$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
[sudo] password for datastax: 
total 64
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 12:58 space1-tb1-jb-2-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  146 Feb  3 12:58 space1-tb1-jb-2-Data.db
-rw-r--r--. 1 cassandra cassandra  336 Feb  3 12:58 space1-tb1-jb-2-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 12:58 space1-tb1-jb-2-Index.db
-rw-r--r--. 1 cassandra cassandra 4393 Feb  3 12:58 space1-tb1-jb-2-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 12:58 space1-tb1-jb-2-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 12:58 space1-tb1-jb-2-TOC.txt
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 13:02 space1-tb1-jb-3-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra   49 Feb  3 13:02 space1-tb1-jb-3-Data.db
-rw-r--r--. 1 cassandra cassandra   16 Feb  3 13:02 space1-tb1-jb-3-Filter.db
-rw-r--r--. 1 cassandra cassandra   36 Feb  3 13:02 space1-tb1-jb-3-Index.db
-rw-r--r--. 1 cassandra cassandra 4413 Feb  3 13:02 space1-tb1-jb-3-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 13:02 space1-tb1-jb-3-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 13:02 space1-tb1-jb-3-TOC.txt

Let's check the contents of both sstables:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-2-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-3-Data.db
[
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]
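The two numbers in each deletionInfo appear to describe the same moment in different units: markedForDeleteAt looks like a write timestamp in microseconds since the epoch, and localDeletionTime a server-side time in seconds. A short sketch to check that for the first tombstone above (the variable names are mine):

```python
from datetime import datetime, timezone

# Values copied from the tombstone for key "6b657931" (key1).
tombstone = {"markedForDeleteAt": 1422968551313000, "localDeletionTime": 1422968551}

# Assumption: markedForDeleteAt is in microseconds, localDeletionTime in seconds.
marked = datetime.fromtimestamp(tombstone["markedForDeleteAt"] // 1_000_000, tz=timezone.utc)
local = datetime.fromtimestamp(tombstone["localDeletionTime"], tz=timezone.utc)

print(marked.isoformat())
print(local.isoformat())
```

Both lines print the same second, which lines up with the 13:02 modification time of the jb-3 files in the listing.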

Now let's compact:

$ nodetool compact space1

Only one sstable now, as expected:

$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
total 32
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 13:05 space1-tb1-jb-4-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  133 Feb  3 13:05 space1-tb1-jb-4-Data.db
-rw-r--r--. 1 cassandra cassandra  656 Feb  3 13:05 space1-tb1-jb-4-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 13:05 space1-tb1-jb-4-Index.db
-rw-r--r--. 1 cassandra cassandra 4429 Feb  3 13:05 space1-tb1-jb-4-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 13:05 space1-tb1-jb-4-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 13:05 space1-tb1-jb-4-TOC.txt

Now let's check the contents of the new sstable; we can see the tombstones:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-4-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]

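Finally, let's check the compaction history (scroll right). The new entry {1:3, 2:2} says that 3 rows (key3, key4, key5) were found in only one sstable, while 2 rows (key1, key2) were found in both sstables (the data in one, the tombstone in the other) and were merged: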

$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id                                       keyspace_name      columnfamily_name            compacted_at              bytes_in       bytes_out      rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b     space1             tb1                          1422968305305             146            146            {1:5}
46112600-aba5-11e4-9f73-351725b0ac5b     space1             tb1                          1422968706144             195            133            {1:3, 2:2}
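The bookkeeping behind that last entry can be reproduced with a few lines of Python. This is only a model of the counting, not Cassandra's code; the rows_merged function and the two sets are stand-ins for the jb-2 and jb-3 sstables shown earlier.

```python
from collections import Counter


def rows_merged(sstables):
    """sstables: list of sets of partition keys, one set per input sstable."""
    # For each key, count how many input sstables contain it.
    appearances = Counter(key for keys in sstables for key in keys)
    # Then count how many keys share each appearance count.
    histogram = Counter(appearances.values())
    return dict(sorted(histogram.items()))


sstable_2 = {"key1", "key2", "key3", "key4", "key5"}  # the five inserted rows
sstable_3 = {"key1", "key2"}                          # the two tombstones

print(rows_merged([sstable_2, sstable_3]))  # {1: 3, 2: 2}
print(rows_merged([sstable_2]))             # {1: 5}
```

The byte columns fit the same picture: bytes_in of 195 equals the two input Data.db sizes (146 + 49), and bytes_out of 133 equals the size of the new jb-4 Data.db.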
Answered by markc, Oct 09 '22 00:10