I have an HBase table whose row keys consist of a text ID and a timestamp, like this:
...
string_id1.1470913344067
string_id1.1470913345067
string_id2.1470913344067
string_id2.1470913345067
...
How can I filter an HBase Scan (in Scala or Java) to get only the rows with a given string ID and a timestamp greater than some value?
Thanks
The fuzzy row approach is efficient for this kind of requirement, especially when the data set is huge. As explained in this article, FuzzyRowFilter takes a row key and mask info as parameters.
In the article's example, we want to find the users who logged in most recently, and the row key format is userId_actionId_timestamp (where userId has a fixed length of, say, 4 chars), so the fuzzy row key we are looking for is ????_login_. This translates into the following params for FuzzyRowFilter:
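import java.util.Arrays;
import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;
// Mask: 1 = any byte is accepted at this position, 0 = byte must match the key.
FuzzyRowFilter rowFilter = new FuzzyRowFilter(
    Arrays.asList(
        new Pair<byte[], byte[]>(
            Bytes.toBytesBinary("\\x00\\x00\\x00\\x00_login_"), // backslashes escaped for Java
            new byte[] {1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0})));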
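To make the mask semantics concrete, here is a minimal plain-Java sketch of the matching rule (1 = any byte is accepted, 0 = the byte must equal the pattern). It does not call HBase at all. The class name FuzzyMaskSketch and its helper methods are made up for illustration, and the length check is a simplification, so treat it as a mental model rather than the filter's real implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: mimics how a fuzzy mask is read, without using HBase.
class FuzzyMaskSketch {

    // Returns true when every "fixed" position (mask byte 0) of the pattern
    // equals the byte at the same position in the row key.
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (rowKey.length < pattern.length) {
            return false; // simplification: the key must at least cover the pattern
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false; // a fixed position differs
            }
            // mask[i] == 1: any byte is accepted, nothing to check
        }
        return true;
    }

    // Hypothetical helper: ASCII bytes of a string.
    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] pattern = ascii("????_login_");
        byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0};
        System.out.println(matches(ascii("0042_login_1470913344067"), pattern, mask));
        System.out.println(matches(ascii("0042_logout_1470913344067"), pattern, mask));
    }
}
```

The first key matches because only the "_login_" part is compared. The second fails at the first byte where "_logout" differs from "_login_".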
I would suggest going through the chapter "Client API: Advanced Features" in HBase: The Definitive Guide.
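For the key layout in the question specifically (one known string ID, timestamp greater than a value), a simpler technique than FuzzyRowFilter may be enough: a plain start/stop row range. The sketch below only builds the two boundary keys in plain Java. It assumes that every timestamp has the same number of digits (13 in the sample keys), so that byte order matches numeric order, and that '.' is always the separator. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: computes scan boundaries for keys shaped like "<id>.<timestamp>".
class ScanRangeSketch {

    // Inclusive lower bound: the first key strictly after "<id>.<minTimestampExclusive>".
    // Assumes timestamps are fixed-width decimal strings.
    static byte[] startRow(String id, long minTimestampExclusive) {
        return (id + "." + (minTimestampExclusive + 1)).getBytes(StandardCharsets.UTF_8);
    }

    // Exclusive upper bound: '/' (0x2F) is the byte right after '.' (0x2E),
    // so "<id>/" sorts after every key that starts with "<id>.".
    static byte[] stopRow(String id) {
        return (id + "/").getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(new String(startRow("string_id1", 1470913344067L), StandardCharsets.UTF_8));
        System.out.println(new String(stopRow("string_id1"), StandardCharsets.UTF_8));
    }
}
```

The two byte arrays would then be set as the start and stop rows of the Scan (withStartRow / withStopRow in recent client versions, setStartRow / setStopRow in older ones), which lets HBase seek directly to the range instead of filtering every row.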