I want to run <code>SELECT DISTINCT</code> in Cassandra, but I think Cassandra doesn't support this operation. How can I do a <code>SELECT DISTINCT</code> in Cassandra? Is it possible?

How to SELECT DISTINCT in cassandra

2 Answers

CQL 3.1.1 and later support the DISTINCT modifier, but only for partition key columns.

SELECT statement now allows listing the partition keys (using the DISTINCT modifier). See CASSANDRA-4536.

Select Syntax

select_statement ::=  SELECT [ JSON | DISTINCT ] ( select_clause | '*' )
                      FROM table_name
                      [ WHERE where_clause ]
                      [ GROUP BY group_by_clause ]
                      [ ORDER BY ordering_clause ]
                      [ PER PARTITION LIMIT (integer | bind_marker) ]
                      [ LIMIT (integer | bind_marker) ]
                      [ ALLOW FILTERING ]
select_clause    ::=  selector [ AS identifier ] ( ',' selector [ AS identifier ] )*
selector         ::=  column_name
                      | term
                      | CAST '(' selector AS cql_type ')'
                      | function_name '(' [ selector ( ',' selector )* ] ')'
                      | COUNT '(' '*' ')'
where_clause     ::=  relation ( AND relation )*
relation         ::=  column_name operator term
                      | '(' column_name ( ',' column_name )* ')' operator tuple_literal
                      | TOKEN '(' column_name ( ',' column_name )* ')' operator term
operator         ::=  '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS KEY
group_by_clause  ::=  column_name ( ',' column_name )*
ordering_clause  ::=  column_name [ ASC | DESC ] ( ',' column_name [ ASC | DESC ] )*
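
To make the partition-key restriction concrete, here is a minimal Python sketch that builds a SELECT DISTINCT statement and rejects column lists Cassandra would refuse. The helper name `build_distinct_query` and the `readings` table are hypothetical, and the check is deliberately stricter than Cassandra itself: as far as I know, static columns may also appear in a DISTINCT selection, which this sketch ignores.

```python
def build_distinct_query(table, partition_key, columns):
    """Return a SELECT DISTINCT statement for `table`.

    Raises ValueError unless `columns` is exactly the set of partition
    key columns (assumption: static columns are left out for simplicity).
    """
    if sorted(columns) != sorted(partition_key):
        raise ValueError(
            "SELECT DISTINCT needs exactly the partition key columns: "
            + ", ".join(partition_key)
        )
    return "SELECT DISTINCT {} FROM {}".format(", ".join(columns), table)


# Hypothetical table with PRIMARY KEY ((sensor_id, day), reading_time)
query = build_distinct_query(
    "readings", ["sensor_id", "day"], ["sensor_id", "day"]
)
```

Asking for `reading_time` (a clustering column) or for only `sensor_id` raises `ValueError` here, which mirrors the kind of query the server would reject.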

answered Sep 28 '22 12:09

Babar

As others wrote, Cassandra does not support DISTINCT on arbitrary columns, only on the partition key. There are two ways to get distinct values for another column:

1. Process in the application: read the entire table from the server and compute the distinct values in code.
2. Create a secondary table whose key is the column you want distinct values for, and write to it as well whenever you write to the original table.

Which one to choose depends on your data size and access pattern. If the table is small or you run this query rarely, option 1 is simple and fast enough. If the table is large or you run this query often, go with option 2.
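
Both options can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the `users` and `countries` tables, the `country` column, and the two helper functions. `session` is assumed to be any object with an `execute(query, parameters)` method, such as a session from the DataStax Python driver, and `rows` is assumed to yield dict-like rows.

```python
def distinct_in_application(rows, column):
    """Option 1: scan every row and de-duplicate on the client."""
    seen = set()
    for row in rows:
        seen.add(row[column])
    return seen


# Hypothetical tables: users(id, country) and countries(country),
# where country is the partition key of the second table.
INSERT_USER = "INSERT INTO users (id, country) VALUES (%s, %s)"
INSERT_COUNTRY = "INSERT INTO countries (country) VALUES (%s)"


def insert_user(session, user_id, country):
    """Option 2: write to the main table and to the lookup table."""
    session.execute(INSERT_USER, (user_id, country))
    # Cassandra inserts are upserts, so repeating a country is harmless.
    session.execute(INSERT_COUNTRY, (country,))
```

With option 2 in place, the distinct values come from reading the small `countries` table instead of scanning `users`. Note that this sketch only handles inserts: if rows are deleted or a user's country changes, the lookup table keeps stale values unless you clean it up separately.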

answered Sep 28 '22 12:09

Moshe Eshel

Tags:

select

distinct

cassandra

Anse danesh
