
How to query Cassandra by date range

I have a Cassandra ColumnFamily (0.6.4) that will have new entries from users. I'd like to query Cassandra for those new entries so that I can process that data in another system.

My sense was that I could use a TimeUUIDType as the key for my entry, and then query on a KeyRange that starts either with "" as the startKey, or whatever the lastStartKey was. Is this the correct method?

How does get_range_slice actually create a range? Doesn't it have to know the data type of the key? There's no declaration of the data type of the key anywhere. In the storage_conf.xml file, you declare the type of the columns, but not of the keys. Is the key assumed to be of the same type as the columns? Or does it do some magic sniffing to guess?

I've also seen reference implementations where people store TimeUUIDType in columns. However, this seems to have scale issues as this particular key would then become "hot" since every change would have to update it.

Any pointers in this case would be appreciated.

Doug asked Aug 20 '10 21:08




1 Answer

When sorting data, only the column keys (the column names) matter. Neither the stored values nor the auto-generated timestamps affect the order. The CompareWith attribute is what controls it: if you set CompareWith to UTF8Type, the keys are compared as UTF-8 strings, and if you set it to TimeUUIDType, the keys are automatically compared by the timestamp embedded in each UUID. You do not have to specify the data type anywhere else.

A good place to start is the SlicePredicate and SliceRange definitions on this page: http://wiki.apache.org/cassandra/API

You might also find this article useful: http://www.sodeso.nl/?p=80
In roughly the third part, the author covers slice-range queries.
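To make the TimeUUIDType ordering concrete, here is a minimal sketch in plain Python. It does not talk to Cassandra at all; it only imitates, in memory, how a slice over time-ordered column names behaves. The helpers `time_uuid` and `slice_by_time` and the sample row are made up for illustration and are not part of any Cassandra client API.

```python
import uuid

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
GREGORIAN_OFFSET = 0x01B21DD213814000


def time_uuid(unix_seconds, node=0):
    """Build a version 1 UUID whose embedded timestamp is unix_seconds.

    Hypothetical helper: clock sequence and node are fixed so the
    example is deterministic.
    """
    ts = unix_seconds * 10_000_000 + GREGORIAN_OFFSET
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, node))


def slice_by_time(columns, start, finish):
    """Return (name, value) pairs with start <= name <= finish, oldest first.

    Names are compared by their embedded timestamp, which is the idea
    behind a TimeUUIDType comparator. In-memory imitation only.
    """
    ordered = sorted(columns, key=lambda name: name.time)
    return [
        (name, columns[name])
        for name in ordered
        if start.time <= name.time <= finish.time
    ]


# One row whose column names are time UUIDs (sample data, one hour apart).
row = {
    time_uuid(1282345200): "entry C",
    time_uuid(1282338000): "entry A",
    time_uuid(1282341600): "entry B",
}

# Ask for everything from the second timestamp onward.
newer = slice_by_time(row, time_uuid(1282341600), time_uuid(1282345200))
for name, value in newer:
    print(name, value)
```

With a real client, the start and finish of the SliceRange would play the role of `start` and `finish` here, and remembering the last column name you processed gives you the starting point for the next poll.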

Sagar V answered Oct 26 '22 10:10