
Range Queries in Cassandra (CQL 3.0)

One part of Cassandra that I don't fully understand is its range queries. I know that Cassandra is designed for distributed environments and focuses on performance, and probably because of that it currently supports only the few types of range queries that it can execute efficiently. What I would like to know is: which types of range queries does Cassandra support?

As far as I know, Cassandra supports the following range queries:

1: Range queries on the primary key using the TOKEN function, for example:

 CREATE TABLE only_int (int_key int PRIMARY KEY);
 ...
 select * from only_int where token(int_key) > 500;

2: Range Queries with one equality condition on a secondary index with keyword ALLOW FILTERING, for example:

CREATE TABLE example (
  int_key int PRIMARY KEY,
  int_non_key int,
  str_2nd_idx ascii
);
CREATE INDEX so_example_str_2nd_idx ON example (str_2nd_idx);
...
select * from example where str_2nd_idx = 'hello' and int_non_key < 5 allow filtering;

But I am wondering whether I am missing something, and I am looking for a canonical answer that lists all types of range queries supported by the current CQL (or a workaround that allows more types of range queries).
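
For reference, here is a minimal sketch of how the token-based form in case 1 is often written, assuming the default Murmur3Partitioner (where token() yields a bigint) and the only_int table above. The literal bounds are illustrative values, not taken from any real data set. A token range follows the order of the hashed keys on the ring, not the numeric order of int_key, so it does not select "keys greater than 500".

```cql
-- Rows whose partition-key token comes after the token of key 500.
-- This walks the ring in token order; it is NOT the same as int_key > 500.
SELECT int_key, token(int_key) FROM only_int
WHERE token(int_key) > token(500)
LIMIT 10;

-- A bounded token range, e.g. to scan the table in chunks.
-- The bounds below are arbitrary illustrative bigint values.
SELECT int_key FROM only_int
WHERE token(int_key) > 0
  AND token(int_key) <= 4611686018427387904;
```

Selecting token(int_key) alongside the key makes the ordering visible, which helps when using the last token of one chunk as the lower bound of the next.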

keelar asked Aug 13 '13 07:08




2 Answers

Take a look at clustering keys. A primary key can consist of a partition key followed by one or more clustering keys.

For example, a definition like this one

CREATE TABLE example (
  int_key int,
  int_non_key int,
  str_2nd_idx ascii,
  PRIMARY KEY((int_key), str_2nd_idx)
);

will allow you to run queries like this one without using token:

select * from example where str_2nd_idx < 'hello' allow filtering;
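
As a sketch of the more common pattern with the same table, assuming you know the partition key: restrict int_key with an equality and apply the range to the clustering column. This form should not need ALLOW FILTERING, because rows inside a single partition are stored sorted by str_2nd_idx. The literal values are illustrative only.

```cql
-- Range slice inside one partition: equality on the partition key,
-- range on the clustering column. No ALLOW FILTERING required.
SELECT * FROM example
WHERE int_key = 42
  AND str_2nd_idx >= 'a'
  AND str_2nd_idx < 'hello';
```

Without the equality on int_key, Cassandra has to scan every partition, which is why the query above it needs ALLOW FILTERING.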

Before creating a table in Cassandra, you should start by thinking about your queries and what you want to ask of the data model.

Zain Malik answered Sep 30 '22 01:09



Apart from the queries you mentioned, you can also run queries on "Composite Key" column families (you need to design your DB using composite keys, if that fits your constraints). For an example and discussion of this, take a look at "Query using composite keys, other than Row Key in Cassandra". With composite keys you can perform other types of queries, namely "range" queries that do not use the "partition key" (the first element of the composite key); normally you need to add ALLOW FILTERING to permit these queries. You can also perform ORDER BY operations on those elements, which can be very useful in many situations. I do think that composite key column families help overcome several (necessary) "limitations" (imposed to guarantee performance) of the Cassandra data model when compared with the "extremely flexible" (but slow) model of an RDBMS.
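
To make the composite-key idea concrete, here is a hedged sketch. The sensor_readings table, its columns, and all literal values are hypothetical and do not come from the question. It shows a range on the last clustering column combined with ORDER BY, under the assumption that the earlier key components are fixed by equality.

```cql
-- Hypothetical table: one partition per sensor, rows sorted by (day, reading_time).
CREATE TABLE sensor_readings (
  sensor_id int,
  day ascii,
  reading_time timestamp,
  value double,
  PRIMARY KEY ((sensor_id), day, reading_time)
);

-- Equality on the partition key and the first clustering column,
-- range on the second clustering column, newest rows first.
SELECT * FROM sensor_readings
WHERE sensor_id = 7
  AND day = '2013-08-13'
  AND reading_time >= '2013-08-13 00:00:00+0000'
  AND reading_time < '2013-08-13 12:00:00+0000'
ORDER BY day DESC, reading_time DESC;
```

Because the partition key is restricted, this query reads a contiguous slice of one partition and does not rely on ALLOW FILTERING; ORDER BY only reverses the stored clustering order rather than sorting arbitrary columns.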

emgsilva answered Sep 30 '22 01:09
