Finding distinct values of non Primary Key column in CQL Cassandra

Tags:

I use the following code for creating table:

CREATE KEYSPACE mykeyspace
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE mykeyspace;
CREATE TABLE users (
  user_id int PRIMARY KEY,
  fname text,
  lname text
);
INSERT INTO users (user_id,  fname, lname)
  VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id,  fname, lname)
  VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id,  fname, lname)
  VALUES (1746, 'john', 'smith');

I would like to find the distinct value of lname column (that is not a PRIMARY KEY). I would like to get the following result:

 lname
-------
 smith

By using SELECT DISTINCT lname FROM users; However since lname is not a PRIMARY KEY I get the following error:

InvalidRequest: code=2200 [Invalid query] message="SELECT DISTINCT queries must
only request partition key columns and/or static columns (not lname)"
cqlsh:mykeyspace> SELECT DISTINCT lname FROM users;

How can I get the distinct values from lname?

490

asked Mar 07 '16 09:03

Avi

2 Answers

User - Undefined_variable - makes two good points:

In Cassandra, you need to build your data model to match your query patterns. This sometimes means duplicating your data into additional tables, to attain the desired level of query flexibility.
DISTINCT only works on partition keys.

So, one way to get this to work, would be to build a specific table to support that query:

CREATE TABLE users_by_lname (
    lname text,
    fname text,
    user_id int,
    PRIMARY KEY (lname, fname, user_id)
);

Now after I run your INSERTs to this new query table, this works:

aploetz@cqlsh:stackoverflow> SELECT DISTINCT lname FROm users_by_lname ;

 lname
-------
 smith
   doe

(2 rows)

Notes: In this table, all rows with the same partition key (lname) will be sorted by fname, as fname is a clustering key. I added user_id as an additional clustering key, just to ensure uniqueness.

answered Oct 30 '22 10:10

Aaron

There is no such functionality in cassandra. DISTINCT is possible on partition key only. You should Design Your data model based on your requirements. You have to process the data in application logic (spark may be useful)

answered Oct 30 '22 11:10

undefined_variable

Related questions
                            
                                Mysql get all records described in in condition even if it does not exists in table
                            
                                select option generate value from ajax on change
                            
                                Remove leading zeros from %m and %d in MySQL date_format
                            
                                SQL: many-to-many relationship, IN condition
                            
                                Jquery Uniform change select size
                            
                                How to use SELECT DISTINCT with RANDOM() function in PostgreSQL?
                            
                                jQuery select change event when selecting the same value
                            
                                SQL Alternative for 'OR' in where clause when using outer join
                            
                                Displaying ngModel value for mat-select
                            
                                oracle improve query performance
                            
                                Select distinct and non-distinct column values
                            
                                Last option in Drop-down is not getting hover effect in Google Chrome 32.0.1700.76 m
                            
                                add extra columns in a SELECT INTO statement
                            
                                POST in a form Jquery select2
                            
                                Select prompt option disappears when validation fails in Rails
                            
                                SQL get last rows in table WITHOUT primary ID
                            
                                How to make a JTable row to an "unselected" state after selected any row in that table?
                            
                                text vertical align in select box - firefox issue
                            
                                Mysql query case with range, and case with currency
                            
                                Angular ui-select tagging on array of objects

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Finding distinct values of non Primary Key column in CQL Cassandra

Tags:

select

distinct

cassandra

cql

cql3

Avi

People also ask

2 Answers

Aaron

undefined_variable

Recent Activity

Donate For Us