
Google Cloud Bigtable: query partial keys

So if I have the following data in Bigtable:

DEL_6878 .....
DEL_6879 .....
BOM_5876 .....
SFO_8686 .....
SFO_8687 .....

How do I query for, say, all SFO* records? I have read the documentation and I know how to get a single row, with something similar to this:

table.get("SFO_8686");

I also know how to get a range, with something like getRows("SFO_8686", "SFO_8687"), which takes a startKey and an endKey. But the documentation led me to believe that one can also get all records that start with a given prefix, as in the SFO* example. How do I do that?

Amit asked Jul 20 '16 12:07


2 Answers

In my experience, PrefixFilter works well for partial row-key (SFO*-style) scans. From what I managed to dig up, setting the start and end rows in addition to the filter should improve performance, presumably because the scan can then skip to the first matching row instead of reading the whole table:

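import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

// rowKey holds the key prefix to match, e.g. "SFO"
PrefixFilter px = new PrefixFilter(Bytes.toBytes(rowKey));
Scan s = new Scan();
s.setStartRow(Bytes.toBytes(rowKey)); // begin at the first row that can match
s.setFilter(px);                      // keep only rows that start with the prefix
...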

Also, as I understand it from the discussion "HBase (Easy): How to Perform Range Prefix Scan in hbase shell", the ROWPREFIXFILTER option in the shell environment is meant to combine the two elements above (the start row and the prefix filter):

scan 'TableName', {ROWPREFIXFILTER => 'SFO'}

But I have not managed to find a Java equivalent of that, if that is what you are after. It would be helpful to hear if others have!
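
For what it's worth, the range half of this can be approximated by hand: a prefix scan is just a range scan from the prefix itself up to (but not including) the first key that no longer starts with it. Below is a minimal sketch of deriving that end key using only the JDK. The class and method names (PrefixRange, stopKeyForPrefix) are made up for illustration and are not part of the HBase or Bigtable API; the idea is that the two byte arrays would be handed to the Scan as its start and stop rows. As far as I know, later HBase client releases also offer Scan.setRowPrefixFilter(byte[]) for this purpose, but it is worth checking whether your client version includes it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixRange {
    // Hypothetical helper: returns the smallest key that sorts after every
    // key starting with the given prefix. An empty array means "no upper
    // bound" (the prefix is empty or consists only of 0xFF bytes).
    public static byte[] stopKeyForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "SFO".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopKeyForPrefix(start);
        // Rows in [start, stop) are exactly the rows that begin with "SFO".
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "SFP"
    }
}
```

With the data from the question, the range ["SFO", "SFP") covers SFO_8686 and SFO_8687 and nothing else, so no filter is strictly needed once both ends of the range are set.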

VS_FF answered Sep 19 '22 18:09


I would think that running a Scan with a range is your most efficient option. You can also use a scan with org.apache.hadoop.hbase.filter.RowFilter with a RegexStringComparator.
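
To illustrate why the range is likely the cheaper of the two, here is a rough sketch that uses an in-memory TreeMap as a stand-in for the sorted table. This is an assumption made purely for illustration: it does not call the Bigtable or HBase API, and the class and method names (RangeVsRegex, byRange, byRegex) are hypothetical. The range lookup jumps straight to the matching keys, while the regex approach has to test every key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Pattern;

public class RangeVsRegex {
    // Range approach: only keys in [start, end) are visited.
    public static List<String> byRange(SortedMap<String, String> table,
                                       String start, String end) {
        return new ArrayList<>(table.subMap(start, end).keySet());
    }

    // Regex approach: every key in the table is tested against the pattern
    // (a full match is required in this sketch).
    public static List<String> byRegex(SortedMap<String, String> table,
                                       String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String key : table.keySet()) {
            if (p.matcher(key).matches()) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample row keys taken from the question; values are placeholders.
        SortedMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"DEL_6878", "DEL_6879", "BOM_5876",
                                      "SFO_8686", "SFO_8687"}) {
            table.put(k, ".....");
        }
        System.out.println(byRange(table, "SFO", "SFP")); // [SFO_8686, SFO_8687]
        System.out.println(byRegex(table, "SFO.*"));      // [SFO_8686, SFO_8687]
    }
}
```

Both calls return the same rows here; the difference is only in how many keys have to be examined to find them.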

Solomon Duskis answered Sep 18 '22 18:09