
How to filter out rows with a given column (not null)?

I want to do an HBase scan with filters. For example, my table has column families A, B, and C, and A has a column X. Some rows have column X and some do not. How can I implement a filter that filters out all the rows with column X?

user1573269 asked Oct 12 '12 12:10


2 Answers

I guess you are looking for SingleColumnValueFilter in HBase. As mentioned in the API documentation:

To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean) on Filter object. Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.

However, SingleColumnValueFilter needs a comparison: a CompareOp and a value to compare ColumnX against. For example, return the row if ColumnX == "X", or return the row if ColumnX != "a sentinel value that ColumnX can never take". Combine the second form with setFilterIfMissing(true), so that a row is returned whenever ColumnX has some value and is dropped when the column is absent.
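To make the row-level decision concrete, here is a minimal plain-Java model of the "not equal to a sentinel" comparison combined with the filter-if-missing flag. It is only a sketch of the semantics quoted above, not HBase client code: the class `FilterIfMissingModel`, its methods, the `"A:X"` column naming, and the sentinel string are all hypothetical, and the model ignores cell versions and byte encoding.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the row-level decision; this is NOT the HBase client API.
class FilterIfMissingModel {

    // Builds a row from alternating column/value strings (hypothetical helper).
    static Map<String, String> row(String... columnValuePairs) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i + 1 < columnValuePairs.length; i += 2) {
            cells.put(columnValuePairs[i], columnValuePairs[i + 1]);
        }
        return cells;
    }

    // Decides whether a row is emitted for the comparison "column != sentinel".
    static boolean emitsRow(Map<String, String> cells, String column,
                            String sentinel, boolean filterIfMissing) {
        String value = cells.get(column);
        if (value == null) {
            // Column absent: the row is dropped only when filterIfMissing is true.
            return !filterIfMissing;
        }
        // Column present: the row is emitted only if the value passes the comparison.
        return !value.equals(sentinel);
    }

    // Returns the keys of the rows that survive the modelled filter, in table order.
    static List<String> scan(Map<String, Map<String, String>> table, String column,
                             String sentinel, boolean filterIfMissing) {
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : table.entrySet()) {
            if (emitsRow(entry.getValue(), column, sentinel, filterIfMissing)) {
                emitted.add(entry.getKey());
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("row1", row("A:X", "1", "B:Y", "2")); // has column X
        table.put("row2", row("B:Y", "3"));             // lacks column X

        String sentinel = "__never_used__"; // assumption: no real value equals this
        System.out.println(scan(table, "A:X", sentinel, true));
        System.out.println(scan(table, "A:X", sentinel, false));
    }
}
```

With the flag set to true, only rows that actually contain the column survive; with it left false, rows lacking the column pass through as well, which is why the flag matters for a "column is not null" scan.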

I hope this nudges you in the right direction.

ankitk answered Nov 15 '22 09:11


You can use a SkipFilter along with a ColumnPrefixFilter. The ColumnPrefixFilter matches keys where the column exists (an HBase row only has a column if it has a value), and the SkipFilter gives you the "not" of the first filter, so the row will be omitted.

Arnon Rotem-Gal-Oz answered Nov 15 '22 10:11