Get HBase Row Keys in Range without Retrieving Data?

Question

Is there a way to retrieve the row keys in a given range without actually retrieving the columns/CFs associated with that row key?

For clarification: In my example, our table's row keys are stock ticker names (e.g. GOOG), and in our web app we'd like to populate an autocomplete widget using just the row keys we have in the database. Obviously, if we retrieve all the data (instead of only the stock names) for all the stocks between G and H when a user types 'G', we'll be unnecessarily straining our system. Any ideas?

Steve DeNeefe · Accepted Answer

According to the official documentation, the efficient way to retrieve only the row keys is to combine two filters: the KeyOnlyFilter and the FirstKeyOnlyFilter. KeyOnlyFilter strips the value from each cell, and FirstKeyOnlyFilter returns only the first cell of each row, so each key comes back once, even for large, complex rows. If you only want keys in a given range, you can add that range to the scanner.

Here is some example code:

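import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL,
            new FirstKeyOnlyFilter(),   // keep only the first cell of each row
            new KeyOnlyFilter());       // strip the value from that cell
Scan s = new Scan();
s.setFilter(filters);        // Scan has no constructor that takes only a filter
// in order to limit the scan to a range
s.setStartRow(startRowKey);  // first key in range (inclusive)
s.setStopRow(stopRowKey);    // key value after the last key in the range (exclusive)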

Source: https://hbase.apache.org/book.html#perf.hbase.client.rowkeyonly
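
If the range comes from a typed prefix, as in the autocomplete case in the question, the stop key can be derived from the prefix itself. Here is a minimal sketch in plain Java, assuming row keys are compared as unsigned bytes (which is how HBase orders them). PrefixRange and stopRowForPrefix are hypothetical names for illustration, not part of the HBase API:

```java
import java.util.Arrays;

// Hypothetical helper (not an HBase class): derives the exclusive stop row
// for a scan over every row key that begins with a given prefix.
final class PrefixRange {
    private PrefixRange() {}

    /**
     * Returns the smallest key that sorts after every key starting with the prefix.
     * Returns an empty array when no such key exists (empty prefix or all 0xFF bytes);
     * HBase treats an empty stop row as "scan to the end of the table".
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        // Skip trailing 0xFF bytes, which cannot be incremented.
        int last = prefix.length - 1;
        while (last >= 0 && prefix[last] == (byte) 0xFF) {
            last--;
        }
        if (last < 0) {
            return new byte[0];
        }
        // Drop the skipped bytes and increment the last remaining one.
        byte[] stop = Arrays.copyOf(prefix, last + 1);
        stop[last]++;
        return stop;
    }
}
```

With the scan above, the prefix bytes would go to s.setStartRow(...) and the result of PrefixRange.stopRowForPrefix(...) to s.setStopRow(...). For the prefix "G" the stop row is "H", so the scan covers every ticker starting with G and nothing else.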

divadpoc · Answer

Take a look at the filters (http://hbase.apache.org/book/client.filter.html), especially KeyOnlyFilter. The description of the filter (from http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/package-summary.html) is:

A filter that will only return the key component of each KV (the value will be rewritten as empty).

To restrict the keys to a specific range, use the Scan(rowStart, rowEnd) constructor.

dminer · Answer

I would create a column family called 'empty:' and store empty values in it for all the rows. Then you can request just the column family 'empty:'. This is not ideal, but it is better than loading column families with a lot of data.

Tags:

hbase

Foxichu
