Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cassandra cql select sorting

If I define a table like this using cql:

CREATE TABLE scores (
 name text,
 age int,
 score int,
 date timestamp,
 PRIMARY KEY (name, age, score)
);

And do a SELECT in cqlsh like this:

select * from mykeyspace.scores;

The result displayed seems to always be sorted by 'age', then 'score' automatically in ascending order regardless of input-data ordering (as expected, return rows are not sorted by the partition key 'name'). I have the following questions:

  1. Does SELECT automatically sort return rows by the clustering keys?
  2. If yes, what's the purpose of using the ORDER BY clause when using SELECT?
  3. If no, how do I get the return rows to sort by the clustering keys since cql doesn't allow ORDER BY on a select *?
like image 581
ptmoy2 Avatar asked Feb 22 '14 02:02

ptmoy2


People also ask

How do I sort data in Cassandra?

Cassandra supports sorting using the clustering columns. When you create a table, you can define clustering columns which will be used to sort the data inside each partition in either ascending or descending orders. Then you can easily use the ORDER BY clause with the ASC or DESC options.

How do you ORDER BY CQL?

You can fine-tune the display order using the ORDER BY clause. The partition key must be defined in the WHERE clause and the ORDER BY clause defines the clustering column to use for ordering. cqlsh> CREATE TABLE cycling.

How do I SELECT distinct rows in Cassandra?

In cassandra you can only select the distinct records from Partition Key column or columns. If Partition key consists of multiple columns, you have to provide all of the columns otherwise you will get an error.

Does Cassandra Sort by partition key?

In this article, we learned that Cassandra uses a partition key or a composite partition key to determine the placement of the data in a cluster. The clustering key provides the sort order of the data stored within a partition. All of these keys also uniquely identify the data.


1 Answers

Your clustering columns define the order (in your case age then score)

http://cassandra.apache.org/doc/cql3/CQL.html#createTableStmt

On a given physical node, rows for a given partition key are stored in the order induced by the clustering columns, making the retrieval of rows in that clustering order particularly efficient (see SELECT).

http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt

The ORDER BY option allows to select the order of the returned results. It takes as argument a list of column names along with the order for the column (ASC for ascendant and DESC for descendant, omitting the order being equivalent to ASC). Currently the possible orderings are limited (which depends on the table CLUSTERING ORDER):

  • if the table has been defined without any specific CLUSTERING ORDER, then the allowed orderings are the order induced by the clustering columns and the reverse of that one.
  • otherwise, the orderings allowed are the order of the CLUSTERING ORDER option and the reversed one.
like image 66
Mikhail Stepura Avatar answered Oct 02 '22 23:10

Mikhail Stepura