HBase FuzzyRowFilter: how does jumping over keys work?

Question

I know that the fuzzy row filter takes two parameters, the first being a row key and the second being the fuzzy logic. What I understood from the corresponding Java class FuzzyRowFilter is that the filter evaluates the current row, tries to compute the next higher row key that will match the fuzzy logic, and jumps over the non-matching keys.

I am unable to understand the following things:

How does the scan jump over certain row keys? Does it use a Get to fetch and compare the current row key? How does the scan know where the next matching row key is without doing a full scan (if it jumps)?

Igor Katkov · Accepted Answer

You understood everything correctly.

For those who came here from a web search, here are two links that explain how row skipping can be leveraged in general and how it's done in FuzzyRowFilter in particular:

HBase FuzzyRowFilter: Alternative to Secondary Indexes
Filters in HBase (or intra row scanning part II)

If a filter knows it is done with the current key and needs to skip ahead:

The filter returns SEEK_NEXT_USING_HINT
Region Server calls getNextCellHint, which returns a suggested Cell
Region Server performs exactly the same routine of finding a key as it did for the first key in the scan: it examines the available HFiles, checking whether the key in question is there
1. Region Server reads the "trailer" section of each file to get the offsets of the metadata blocks
2. Region Server reads the Meta and FileInfo metadata block types to avoid reading the binary data from the HFile if there’s no chance that the key is present (Bloom Filter), if the file is too old (Max SequenceId) or if the file is too new (Timerange) to contain what we’re looking for. See more about HFile format here
3. Should the key be inside the HFile, Region Server uses the DataBlock index segments to compute the offset of the data block that holds the key in question
4. If that data block happens to already be in the Region Server block cache, the next step is skipped
5. The data block is read from the HFile
6. Region Server finally scans the keys one by one until it hits the target one
The found key, and potentially the whole row (depending on the filter), is passed to the filter code
The whole cycle repeats
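
To make the cycle above concrete, here is a minimal, self-contained Java sketch. It does not use the HBase API: a TreeSet<String> stands in for the sorted store, ceiling() stands in for the seek that the Region Server performs, and the '?' wildcard pattern, the class SeekHintSketch and the methods matches, nextHint and scan are all made up for illustration. It also assumes fixed-length keys. The real FuzzyRowFilter works on byte[] keys with a separate mask array, so treat this only as a rough model of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class SeekHintSketch {
    // Hypothetical wildcard marker; HBase uses a separate byte[] mask instead.
    static final char WILDCARD = '?';

    /** True if the row has the pattern's length and agrees on every fixed position. */
    static boolean matches(String row, String pattern) {
        if (row.length() != pattern.length()) return false;
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p != WILDCARD && p != row.charAt(i)) return false;
        }
        return true;
    }

    /**
     * For a non-matching row of the pattern's length, returns the smallest key
     * greater than the row that satisfies the pattern, or null if there is none
     * (or if the row already matches).
     */
    static String nextHint(String row, String pattern) {
        char[] hint = row.toCharArray();
        int toInc = -1; // rightmost wildcard position seen so far that can still grow
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p == WILDCARD) {
                if (hint[i] < Character.MAX_VALUE) toInc = i;
                continue;
            }
            if (hint[i] == p) continue;
            int resetFrom;
            if (hint[i] < p) {
                resetFrom = i;              // keep the prefix, raise this position
            } else {
                if (toInc < 0) return null; // nothing larger can match
                hint[toInc]++;              // bump an earlier wildcard position
                resetFrom = toInc + 1;
            }
            // Everything after the changed position gets its smallest legal value.
            for (int j = resetFrom; j < pattern.length(); j++) {
                char q = pattern.charAt(j);
                hint[j] = (q == WILDCARD) ? Character.MIN_VALUE : q;
            }
            return new String(hint);
        }
        return null;
    }

    /** Scans the store, seeking to the hint whenever a row does not match. */
    static List<String> scan(TreeSet<String> store, String pattern, List<String> examined) {
        List<String> result = new ArrayList<>();
        String row = store.isEmpty() ? null : store.first();
        while (row != null) {
            examined.add(row);                  // the filter gets to see this row
            if (matches(row, pattern)) {
                result.add(row);                // include it and move to the next row
                row = store.higher(row);
            } else {
                String hint = nextHint(row, pattern);               // the "seek hint"
                row = (hint == null) ? null : store.ceiling(hint);  // the "seek"
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<String> store = new TreeSet<>(Arrays.asList(
                "0001_0001", "0001_0002", "0001_0004", "0001_0007", "0002_0001",
                "0002_0004", "0002_0009", "0003_0002", "0003_0003"));
        List<String> examined = new ArrayList<>();
        List<String> found = scan(store, "????_0004", examined);
        System.out.println("matched:  " + found);
        System.out.println("examined: " + examined.size() + " of " + store.size() + " rows");
    }
}
```

In this toy data set the filter only ever sees five of the nine rows; the other four are jumped over by the seek, which is the effect the steps above describe. How much a real scan saves depends on how the matching keys are spread across data blocks and HFiles.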

Tags:

hbase

bigdata

hfile

Vikram Singh Chandel
