
How to filter an HBase Scan by part of the row key?

I have an HBase table whose row keys consist of a text ID and a timestamp, like this:

...
string_id1.1470913344067
string_id1.1470913345067
string_id2.1470913344067
string_id2.1470913345067
...

How can I filter an HBase Scan (in Scala or Java) to get only the rows with a given string ID and a timestamp greater than some value?

Thanks

Vital Yeutukhovich asked Aug 11 '16 12:08



1 Answer

The fuzzy row approach is efficient for this kind of requirement, especially when the data set is huge. As explained in this article, FuzzyRowFilter takes a row key and mask info as parameters.

For example, suppose we want to find the last logged-in users and the row key format is userId_actionId_timestamp (where userId has a fixed length of, say, 4 characters). The fuzzy row key we are looking for is ????_login_. In the mask, 1 marks a position where any byte is accepted and 0 marks a byte that must match. This translates into the following parameters for FuzzyRowFilter:

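import java.util.Arrays;
import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;
// Backslashes are doubled so Java passes the literal text \x00 to toBytesBinary.
FuzzyRowFilter rowFilter = new FuzzyRowFilter(
 Arrays.asList(
  new Pair<byte[], byte[]>(
    Bytes.toBytesBinary("\\x00\\x00\\x00\\x00_login_"),
    new byte[] {1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0})));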
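
To make the mask semantics more concrete, here is a minimal plain-Java sketch of how a pattern and mask of this kind select row keys. It is a simplified model for illustration only, not HBase's actual implementation: the class name FuzzyMaskSketch, its methods, and the sample row keys are all hypothetical, and it assumes that keys shorter than the pattern never match and that bytes beyond the pattern length are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of fuzzy-mask matching; NOT HBase's implementation.
final class FuzzyMaskSketch {

    // Mask convention used above: 0 = byte is fixed and must match,
    // 1 = any byte is accepted at this position.
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (pattern.length != mask.length) {
            throw new IllegalArgumentException("pattern and mask must have equal length");
        }
        // Assumption of this sketch: keys shorter than the pattern never match.
        if (rowKey.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false;
            }
        }
        // Bytes after the pattern (here, the timestamp) are not inspected.
        return true;
    }

    // Returns the row keys that satisfy the pattern/mask pair, in input order.
    static List<String> filter(List<String> rowKeys, byte[] pattern, byte[] mask) {
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            if (matches(key.getBytes(StandardCharsets.UTF_8), pattern, mask)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The first four bytes are masked out, so '?' is only a placeholder.
        byte[] pattern = "????_login_".getBytes(StandardCharsets.UTF_8);
        byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0};
        // Hypothetical keys in the userId_actionId_timestamp format.
        List<String> keys = Arrays.asList(
                "u001_login_1470913344067",
                "u001_logout_1470913345067",
                "u002_login_1470913345067");
        System.out.println(filter(keys, pattern, mask));
    }
}
```

With these sample keys, the two login rows are kept and the logout row is dropped, whatever the 4-character userId is. Note that a mask like this only pins down fixed positions in the key; it does not by itself express a "timestamp greater than" condition.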

I would suggest reading "HBase: The Definitive Guide", in particular the chapter "Client API: Advanced Features".

Ram Ghadiyaram answered Oct 16 '22 22:10
