I would like to perform a double range query to find latitude/longitude points near a given point.
This now seems to be possible in Cassandra. I just tried the following:
create column family users
with comparator=UTF8Type
AND key_validation_class=UTF8Type
and column_metadata=[{column_name: full_name, validation_class: UTF8Type},
{column_name: type, validation_class: UTF8Type, index_type: KEYS},
{column_name: lat, validation_class: LongType, index_type: KEYS},
{column_name: lon, validation_class: LongType, index_type: KEYS}];
SET users['a']['type']='test';
SET users['b']['type']='test';
SET users['c']['type']='test';
SET users['a']['lat']='12';
SET users['b']['lat']='9';
SET users['c']['lat']='12';
SET users['b']['lon']='1';
SET users['a']['lon']='4';
SET users['c']['lon']='2';
get users where type = 'test' and lon < '6' and lon > '3' and lat > '10' and lat < '13';
RowKey: a
=> (column=lat, value=12, timestamp=1336339056413000)
=> (column=lon, value=4, timestamp=1336339088170000)
=> (column=type, value=test, timestamp=1336339033765000)
1 Row Returned.
But I'm quite worried about performance once thousands of points are added, given that all three columns are indexed.
1) I had to index the 'type' column and include it in the query, because without it the query fails with:
No indexed columns present in index clause with operator EQ
Is it possible to bypass this requirement?
2) It could be interesting to store all the data naturally sorted by lat or lon, and then query only on the other one.
That is, do a SliceQuery for lat between x and y, followed by a query such as:
get users where type = 'test' and lon < '6' and lon > '3';
How can the CF be ordered by another field instead of by row name (e.g. a string combining lat+lon, with a UTF8 comparator)?
thanks
Your solution may work on a small dataset. Once it grows, you will need some kind of spatial index to perform fast lookups. Cassandra does not support spatial indexes for now. I would suggest you look at GeoCell / GeoHash.
You create a hash for each point's coordinates, and then you can perform range queries over the resulting string. In this case Cassandra range queries would be a good option.
GeoHash is a hierarchical spatial data structure that subdivides space into grid-shaped buckets.
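To make the idea concrete, here is a minimal sketch in Python of the standard geohash encoding, plus a helper that turns a hash prefix into string bounds for a range query. The names geohash_encode and prefix_range are made up for this illustration and are not part of Cassandra or any client library. How you then store and query the hash (as a row key, column name, etc.) depends on your schema.

```python
# Standard geohash base-32 alphabet (no a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string.

    Hypothetical helper for illustration. Bits alternate between
    longitude and latitude, starting with longitude; every 5 bits
    become one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def prefix_range(prefix):
    """Return (start, end) so that start <= h < end for every string h
    that begins with the (non-empty) prefix, under byte-wise ordering.

    Hypothetical helper: these are the bounds you would pass to a
    range query over the stored hashes.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


# All points inside the grid cell "ezs42" fall in this string range.
start, end = prefix_range(geohash_encode(42.6, -5.6, precision=5))
nearby = geohash_encode(42.605, -5.603)
print(start, end, start <= nearby < end)
```

A shorter prefix means a larger cell. Note that two points can be very close and still not share a prefix if they sit on opposite sides of a cell boundary, so a real proximity search usually queries the neighbouring cells as well.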