
What is the role of the bloom filter in Cassandra?

I found the following two statements on different pages of the Cassandra documentation:

link 1

A structure stored in memory that checks if row data exists in the memtable before accessing SSTables on disk

and

link 2

Cassandra checks the Bloom filter to discover which SSTables are likely to have the request partition data.

Are both of these statements correct? If so, are bloom filters maintained separately for memtables and SSTables? Thanks in advance.

Mayank Raghav asked Sep 05 '16 09:09


2 Answers

A Bloom filter is a generic data structure used to check whether an element is present in a set. It is designed to be extremely fast and compact, at the cost of occasionally returning false positives.

Cassandra uses bloom filters to test whether each SSTable is likely to contain the requested partition key, without actually having to read the SSTable's contents (and thus avoiding expensive I/O operations).

If a bloom filter returns false for a given partition key, the key is definitely not present in the corresponding SSTable. If it returns true, the SSTable is likely, but not guaranteed, to contain the key. When this happens, Cassandra goes on to further checks along the read path to determine whether it needs to read that SSTable or not. Note that bloom filters are consulted for most reads, and updated only during some writes (when a memtable is flushed to disk). You can read more about Cassandra's read path here.
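To make the "definitely absent" versus "probably present" behaviour concrete, here is a minimal Bloom filter sketch in Python. It is a toy for illustration only: the class name, the SHA-256 based double hashing and the sizes are assumptions made for this example, not Cassandra's actual implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: each key sets num_hashes bits in a num_bits array."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Double hashing: derive all positions from two 64-bit values
        # taken from a single SHA-256 digest (an illustrative choice).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


if __name__ == "__main__":
    bf = BloomFilter(num_bits=1024, num_hashes=3)
    for key in ("user:1", "user:2", "user:3"):
        bf.add(key)
    print(bf.might_contain("user:2"))    # True: added keys are never reported absent
    print(bf.might_contain("user:999"))  # usually False; True would be a false positive
```

Because bits are only ever set and never cleared, a key that was added can never be reported as absent, which is why a negative answer can be trusted completely.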

Back to your questions:

1) The first statement ("A structure stored in memory that checks if row data exists in the memtable before accessing SSTables on disk") is IMHO not accurate: bloom filters are indeed updated when a memtable is flushed to disk, but they do not reference the memtable.

2) Bloom filters are maintained per SSTable, i.e. each SSTable on disk gets a corresponding bloom filter in memory.

adutra answered Sep 28 '22 08:09


In the read path, Cassandra merges data on disk (in SSTables) with data in RAM (in memtables). To avoid checking every SSTable data file for the partition being requested, Cassandra employs a data structure known as a bloom filter.

Bloom filters are a probabilistic data structure that allows Cassandra to determine one of two possible states:

  - The data definitely does not exist in the given file, or
  - The data probably exists in the given file.
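The following sketch shows how one filter per SSTable can let a read skip files that definitely do not hold the partition. Everything in it is hypothetical (the TinyBloom and SSTable classes, the read_partition function, the disk_reads counter); it returns the raw fragments it finds rather than merging them the way Cassandra does, and it only illustrates the "skip on a definite no" idea.

```python
import hashlib


class TinyBloom:
    """Very small Bloom filter backed by a Python int used as a bit set."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: str):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Hypothetical stand-in for an on-disk file plus its in-memory filter."""

    def __init__(self, rows: dict) -> None:
        self.rows = dict(rows)
        self.bloom = TinyBloom()
        for key in self.rows:
            self.bloom.add(key)
        self.disk_reads = 0  # counts simulated disk accesses

    def read(self, key: str):
        self.disk_reads += 1
        return self.rows.get(key)


def read_partition(key: str, memtable: dict, sstables: list) -> list:
    """Collect data for key from the memtable and from candidate SSTables."""
    fragments = []
    if key in memtable:  # the memtable is checked directly, no filter involved
        fragments.append(memtable[key])
    for sstable in sstables:
        if not sstable.bloom.might_contain(key):
            continue  # definitely absent: skip the disk read entirely
        value = sstable.read(key)  # probably present: may still be a miss
        if value is not None:
            fragments.append(value)
    return fragments


if __name__ == "__main__":
    memtable = {"a": "newest"}
    tables = [SSTable({"a": "older"}), SSTable({"b": "other"})]
    print(read_partition("a", memtable, tables))
    print([t.disk_reads for t in tables])
```

A false positive only costs one wasted read of an SSTable that turns out not to hold the key; it never changes the result of the query.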

While bloom filters cannot guarantee that the data exists in a given SSTable, they can be made more accurate by allowing them to consume more RAM. Operators can tune this behavior per table by setting bloom_filter_fp_chance to a float between 0 and 1.

The default value for bloom_filter_fp_chance is 0.1 for tables using LeveledCompactionStrategy and 0.01 for all other cases.

Bloom filters are stored in RAM, but off-heap, so operators do not need to account for them when selecting the maximum heap size. As accuracy improves (as bloom_filter_fp_chance gets closer to 0), memory usage increases non-linearly: the bloom filter for bloom_filter_fp_chance = 0.01 will require about three times as much memory as the same table with bloom_filter_fp_chance = 0.1.
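For a rough feel of how memory scales with the false-positive target, the textbook formula for an optimally sized Bloom filter is -ln(p) / (ln 2)^2 bits per key. The sketch below applies that formula. It is the generic result, not necessarily how Cassandra sizes its filters, and the function names are made up for this example. Note that the textbook ratio between 0.01 and 0.1 comes out at about 2x, so the "about three times" figure above should be read as specific to Cassandra's implementation.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook bits per key for an optimally sized Bloom filter."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def optimal_hash_count(fp_chance: float) -> int:
    """Textbook number of hash functions for the same target."""
    return max(1, round(-math.log2(fp_chance)))


def estimated_mebibytes(num_keys: int, fp_chance: float) -> float:
    """Estimated filter size in MiB for num_keys keys (theory only)."""
    return num_keys * bits_per_key(fp_chance) / 8 / 2**20


if __name__ == "__main__":
    for p in (0.1, 0.01):
        print(p, round(bits_per_key(p), 2), optimal_hash_count(p))
    # 0.1 needs about 4.8 bits per key, 0.01 about 9.6 bits per key
```

The relationship is logarithmic: each tenfold reduction in the false-positive chance adds a fixed number of bits per key (about 4.8 in this model) rather than multiplying the size by ten.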

Typical values for bloom_filter_fp_chance are between 0.01 (1%) and 0.1 (10%). The value is the false-positive chance: the probability that Cassandra scans an SSTable for a row, only to find that it does not exist on disk. The parameter should be tuned by use case:

  1. Users with more RAM and slower disks may benefit from setting the bloom_filter_fp_chance to a numerically lower number (such as 0.01) to avoid excess IO operations.

  2. Users with less RAM, denser nodes, or very fast disks may tolerate a higher bloom_filter_fp_chance in order to save RAM at the expense of excess IO operations.

  3. In workloads that rarely read, or that only perform reads by scanning the entire data set (such as analytics workloads), setting the bloom_filter_fp_chance to a much higher number is acceptable.
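As a sketch of how the setting might be applied, bloom_filter_fp_chance is a table option that can be changed with a CQL ALTER TABLE statement. The helper below only builds that statement as a string. The function name and the keyspace and table names are hypothetical, and the 0 to 1 range check simply mirrors the range given above; the server performs its own validation.

```python
import re

# Conservative pattern for unquoted identifiers (an assumption for this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def bloom_fp_statement(keyspace: str, table: str, fp_chance: float) -> str:
    """Build a CQL statement that sets bloom_filter_fp_chance on one table."""
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if not 0.0 <= fp_chance <= 1.0:
        raise ValueError("fp_chance must be between 0 and 1")
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH bloom_filter_fp_chance = {fp_chance};"
    )


if __name__ == "__main__":
    # "shop" and "orders" are placeholder names, not from the original post.
    print(bloom_fp_statement("shop", "orders", 0.01))
    # prints: ALTER TABLE shop.orders WITH bloom_filter_fp_chance = 0.01;
```

The resulting string can be run in cqlsh or through a driver. To my understanding, the new value applies to SSTables written after the change, so existing SSTables keep their old filters until they are rewritten.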

Phoenix answered Sep 28 '22 07:09