
Cassandra LeveledCompactionStrategy and high SSTable number per read

We are using Cassandra 2.0.17 and have a table whose workload is 50% selects, 40% updates and 10% inserts (no deletes).

To get high read performance for such a table, we found that LeveledCompactionStrategy is recommended (it is supposed to guarantee that 99% of reads will be fulfilled from a single SSTable). Every day when I run nodetool cfhistograms I see more and more SSTables per read. On the first day we had 1, then we had 1, 2, 3 ...
and this morning I am seeing this:

ubuntu@ip:~$ nodetool cfhistograms prodb groups | head -n 20
prodb/groups histograms

SSTables per Read
1 sstables: 27007
2 sstables: 97694
3 sstables: 95239
4 sstables: 3928
5 sstables: 14
6 sstables: 0
7 sstables: 19

Running describe groups returns this:

CREATE TABLE groups (
  ...
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=172800 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

Is this normal? If so, we lose the advantage of using LeveledCompactionStrategy, which according to the documentation should guarantee that 99% of reads are served from a single SSTable.
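
To put a number on the histogram above, the share of single-SSTable reads can be computed directly from the cfhistograms output. This is only a minimal Python sketch: it assumes the "N sstables: count" line format shown in the question, and the function name is made up for illustration.

```python
import re

def single_sstable_share(histogram_text):
    """Return the fraction of reads that touched exactly one SSTable."""
    counts = {}
    for line in histogram_text.splitlines():
        # Match lines such as "2 sstables: 97694"; ignore everything else.
        m = re.match(r"\s*(\d+) sstables: (\d+)\s*$", line)
        if m:
            counts[int(m.group(1))] = int(m.group(2))
    total = sum(counts.values())
    return counts.get(1, 0) / total if total else 0.0

# The histogram pasted in the question.
sample = """\
SSTables per Read
1 sstables: 27007
2 sstables: 97694
3 sstables: 95239
4 sstables: 3928
5 sstables: 0014
6 sstables: 0
7 sstables: 19
""".replace("0014", "14")

print(f"{single_sstable_share(sample):.1%}")  # prints 12.1%
```

For this sample roughly 12% of reads were served from one SSTable, far from the 99% figure quoted for LeveledCompactionStrategy.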

asked Oct 05 '16 08:10 by Jakub Troszok



1 Answer

It does depend on the use case, but as a rule of thumb I normally consider LeveledCompactionStrategy (LCS) for a ratio of about 90% reads to 10% writes. From your description you're at 50/50 at best.

The additional compaction demands of LCS make it pretty I/O hungry. It's highly likely that compaction is backed up and your levels are not balanced. The easiest way to tell is to run nodetool cfstats for the table in question.

You're looking for the line:

SSTables in each level: [2042/4, 10, 119/100, 232, 0, 0, 0, 0, 0]

The numbers in the square brackets show how many SSTables are in each level: [L0, L1, L2, ...]. The number after the slash is the ideal count for that level. As a rule of thumb L1 should hold 10, L2 100, L3 1000, etc.
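
If you want to check that line programmatically, something like the following Python sketch could work. It assumes the line has exactly the format shown above and that the slash suffix appears only on levels holding more SSTables than their target (as in the example); the function names are made up for illustration.

```python
import re

def parse_levels(line):
    """Parse 'SSTables in each level: [2042/4, 10, ...]' into
    (level, count, limit) tuples. limit is the number after the slash,
    or None when the entry has no slash."""
    inner = re.search(r"\[(.*)\]", line).group(1)
    levels = []
    for level, entry in enumerate(inner.split(",")):
        count, _, limit = entry.strip().partition("/")
        levels.append((level, int(count), int(limit) if limit else None))
    return levels

def overfull(levels):
    """Return only the levels that carry a '/limit' suffix."""
    return [item for item in levels if item[2] is not None]

line = "SSTables in each level: [2042/4, 10, 119/100, 232, 0, 0, 0, 0, 0]"
for level, count, limit in overfull(parse_levels(line)):
    print(f"L{level}: {count} sstables, target {limit}")
```

On the example line this reports L0 (2042 against a target of 4) and L2 (119 against 100).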

New SSTables go in at L0 and then gradually move up. You can see the above example is in a really bad state: there are still about 2,000 SSTables waiting in L0, more than exist in all the other levels combined. The read performance here will be massively worse than if I'd just used SizeTieredCompactionStrategy (STCS).

nodetool cfstats makes it pretty easy to measure whether LCS is keeping up with your use case. Just dump out the above line every 15 minutes throughout the day. Any time your levels are unbalanced, read performance will suffer. If it's constantly behind, you probably want to switch to STCS. If it spikes for, say, 10 minutes when you load data but the rest of the day is fine, then you may decide to live with it. If it never goes out of balance, stick with LCS - it's totally working for you.
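
The 15-minute dump could be scripted roughly as below. This is a sketch, not a tested monitoring tool: it assumes nodetool is on the PATH and that your version accepts a keyspace.table argument to cfstats (if not, run it without the argument and filter the output). The function names and the injectable fetch/sleep parameters are my own additions so the loop can be tried without a live node.

```python
import subprocess
import time

def run_cfstats(table):
    # Assumption: this nodetool version accepts "keyspace.table" here.
    result = subprocess.run(["nodetool", "cfstats", table],
                            capture_output=True, text=True, check=True)
    return result.stdout

def level_line(cfstats_output):
    """Return the 'SSTables in each level' line, or None if it is absent."""
    for line in cfstats_output.splitlines():
        if "SSTables in each level" in line:
            return line.strip()
    return None

def sample_levels(table, samples, interval_s=900,
                  fetch=run_cfstats, sleep=time.sleep):
    """Collect (timestamp, level line) pairs, one every interval_s seconds."""
    results = []
    for i in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        results.append((stamp, level_line(fetch(table))))
        if i < samples - 1:
            sleep(interval_s)  # no wait after the final sample
    return results

# Example: sample_levels("prodb.groups", samples=96) covers 24 hours
# at the default 15-minute interval.
```

Printing each pair as it is collected, or appending it to a file, gives the day-long picture described above.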

As a side note - Cassandra 2.1 allows L0 to carry out STCS-style merging, which helps when you have a temporary spike. If you're in the ten-minute scenario above, it's almost certainly worth an upgrade.

answered Nov 17 '22 03:11 by Nom de plume