So if I have the following data in Bigtable:
DEL_6878 .....
DEL_6879 .....
BOM_5876 .....
SFO_8686 .....
SFO_8687 .....
How do I query for, say, SFO* records? I read the documentation and I know how to get a single row, with something similar to this:
table.get("SFO_8686");
Or how to get a range, with something like getRows("SFO_8686", "SFO_8687"), which takes a startKey and an endKey. But the documentation led me to believe that one can also get records that start with a prefix, as in the SFO* example. How do I do that?
In my experience, PrefixFilter works well for partial row-key (SFO*-style) scans. From what I managed to dig out, also setting the start and end rows should improve performance, presumably because the scan no longer has to walk the whole table. The snippet here sets only the start row:
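import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;
PrefixFilter px = new PrefixFilter(Bytes.toBytes(rowKey)); // keep only rows whose key starts with rowKey
Scan s = new Scan();
s.setStartRow(Bytes.toBytes(rowKey)); // seek straight to the first candidate row
s.setFilter(px);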
...
Also, from what I understand from the discussion "HBase (Easy): How to Perform Range Prefix Scan in hbase shell", in the shell environment ROWPREFIXFILTER is meant to combine the two elements above:
scan 'TableName', {ROWPREFIXFILTER => 'SFO'}
But I have not managed to find a Java equivalent of that, if that's what you are after. It would be helpful to hear if others have!
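For what it's worth, as far as I can tell the HBase Java client's Scan has a setRowPrefixFilter(byte[]) method (HBase 1.x and later) that does the same thing: it uses the prefix as the start row and derives an exclusive stop row from it. The derivation itself is simple enough to sketch in plain Java with no HBase dependency. The class and method names below (PrefixStopRow, stopRowForPrefix) are made up for illustration and are not part of any library; treat this as a sketch of the idea, not the client's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: not an HBase or Bigtable class.
final class PrefixStopRow {
    /**
     * Returns the smallest key that sorts after every key starting with prefix.
     * Returns an empty array when no such key exists (empty or all-0xFF prefix),
     * which this sketch takes to mean "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("SFO".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "SFP"
    }
}
```

With the start row set to "SFO" and the stop row set to "SFP" (exclusive), a plain range scan returns exactly the SFO* rows, which is the Java-side counterpart of the shell's ROWPREFIXFILTER as I understand it.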
I would think that running a Scan with a range is your most efficient option. You can also use a scan with org.apache.hadoop.hbase.filter.RowFilter with a RegexStringComparator.
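To make the efficiency difference concrete, here is a small stand-alone model, not real HBase or Bigtable code. It assumes the table behaves like a sorted set of row keys (a TreeSet of ASCII strings stands in for it) and counts how many keys each approach has to look at. ScanComparison, Result, regexScan and rangeScan are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical model of a sorted table; not an HBase or Bigtable class.
final class ScanComparison {
    /** Matching keys plus the number of keys that had to be inspected. */
    static final class Result {
        final List<String> keys = new ArrayList<>();
        int examined = 0;
    }

    /** Models a regex row filter with no row bounds: every key is tested. */
    static Result regexScan(TreeSet<String> table, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Result result = new Result();
        for (String key : table) {
            result.examined++;
            if (pattern.matcher(key).matches()) {
                result.keys.add(key);
            }
        }
        return result;
    }

    /** Models a range scan: seek to the prefix, stop at the first non-match. */
    static Result rangeScan(TreeSet<String> table, String prefix) {
        Result result = new Result();
        for (String key : table.tailSet(prefix, true)) {
            result.examined++;
            if (!key.startsWith(prefix)) {
                break; // keys are sorted, so nothing later can match
            }
            result.keys.add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<String> table = new TreeSet<>(Arrays.asList(
                "DEL_6878", "DEL_6879", "BOM_5876", "SFO_8686", "SFO_8687"));
        Result regex = regexScan(table, "SFO.*");
        Result range = rangeScan(table, "SFO");
        System.out.println("regex: " + regex.keys + " examined " + regex.examined);
        System.out.println("range: " + range.keys + " examined " + range.examined);
    }
}
```

On the five sample keys from the question, both approaches return SFO_8686 and SFO_8687, but the regex version inspects all five keys while the range version inspects only two. On a real table the gap grows with the number of rows outside the prefix.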