
Double range query in Cassandra

Tags: cassandra, cql

I would like to perform a double range query to get the latitude and longitude points near a given point.

In Cassandra this now seems possible. I just tried:

create column family users
 with comparator=UTF8Type
 AND key_validation_class=UTF8Type
 and column_metadata=[{column_name: full_name, validation_class: UTF8Type},
 {column_name: type, validation_class: UTF8Type, index_type: KEYS},
 {column_name: lat, validation_class: LongType, index_type: KEYS},
 {column_name: lon, validation_class:  LongType, index_type: KEYS}];

SET users['a']['type']='test';                                             
SET users['b']['type']='test';
SET users['c']['type']='test';
SET users['a']['lat']='12';                                                
SET users['b']['lat']='9'; 
SET users['c']['lat']='12';
SET users['b']['lon']='1'; 
SET users['a']['lon']='4';
SET users['c']['lon']='2';
get users where type = 'test' and lon < '6' and lon > '3' and lat > '10' and lat < '13';

RowKey: a => (column=lat, value=12, timestamp=1336339056413000) => (column=lon, value=4, timestamp=1336339088170000) => (column=type, value=test, timestamp=1336339033765000)

1 Row Returned.

But I'm quite worried about performance once thousands of points are added, if all 3 of those columns are indexed.

1) I had to include the indexed 'type' column in the query, because without it the query fails with:

No indexed columns present in index clause with operator EQ

Is it possible to bypass this?

2) It could be interesting to keep all the data naturally sorted by lat or lon, and then query only on the other one.

That would mean doing a SliceQuery for the lat between x and y, followed by a query such as:

get users where type = 'test' and lon < '6' and lon > '3';

How can I order the CF not by row names but by another field (e.g. a String lat+lon with a UTF8 comparator)?

Thanks


1 Answer

Your solution may work on a smaller dataset. Once it grows, you will need some kind of spatial index to perform fast lookups. Cassandra does not support spatial indexes for now. I would suggest you look at GeoCell / GeoHash.

You create a hash for each point's coordinates, and then you can perform range queries over the resulting string. In this case Cassandra range queries would be a good option.
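
To make the hashing step concrete, here is a minimal sketch of the standard geohash encoding in Python. The function name geohash_encode and its precision parameter are my own choices for illustration, not part of any library mentioned here; a real project would more likely use an existing implementation.

```python
# Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string (illustrative sketch)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lon = True    # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # prints u4pru
print(geohash_encode(42.6, -5.6, precision=5))          # prints ezs42
```

Each extra character narrows the cell, so the string can be stored as a sortable key or column name and queried by range.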

GeoHash is a hierarchical spatial data structure which subdivides space into buckets of grid shape.
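
Because the structure is hierarchical, every point inside a cell shares that cell's geohash as a prefix, so "all points in this cell" turns into one contiguous range over sorted strings. A rough sketch in plain Python, using a sorted list as a stand-in for an ordered store; the helper prefix_scan and the sample keys are made up for illustration, and how the strings are kept sorted in Cassandra (for example as column names under a UTF8 comparator) is an assumption left to the schema design:

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Return all keys that start with `prefix`, using two binary searches."""
    lo = bisect_left(sorted_keys, prefix)
    # '~' sorts after every character of the geohash alphabet (0-9, b-z),
    # so prefix + '~' is an upper bound for all keys sharing the prefix.
    hi = bisect_left(sorted_keys, prefix + "~")
    return sorted_keys[lo:hi]


# Made-up geohash keys, kept sorted the way an ordered store would keep them.
keys = sorted([
    "u4pruydq", "u4prv0zz", "u4ps1234",
    "ezs42e44", "u4pq9999", "u4pruyd0",
])

nearby = prefix_scan(keys, "u4pr")
print(nearby)  # prints ['u4pruyd0', 'u4pruydq', 'u4prv0zz']
```

One caveat: two points that are close together but sit on opposite sides of a cell boundary may not share a prefix, so a "near this point" lookup usually has to scan the neighbouring cells as well.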

Links:

  • geohashing
  • Wikipedia: http://en.wikipedia.org/wiki/Geohash
  • Java Implementation http://code.google.com/p/javageomodel/
answered May 22 '26 07:05 by vladaman
