Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I create a secondary index on multiple columns in cassandra

Can I create a secondary index on multiple columns in cassandra? like can I do

create index my_composite_index on my_column_family (id,name)

CQL throws error

:2:Bad Request: line 1:73 mismatched input ',' expecting ')'

like image 371
user1955409 Avatar asked Nov 20 '13 15:11

user1955409


People also ask

Can an index be created on multiple columns?

An index can be defined on more than one column of a table. For example, if you have a table of this form: CREATE TABLE test2 ( major int, minor int, name varchar );

What will happen if you apply index on multiple column?

An index with more than one column aggregates the contents.

Does Cassandra support secondary index?

Secondary Indexes in Cassandra solve the need for querying columns that are not part of the primary key. When we insert data, Cassandra uses an append-only file called commitlog for storing the changes, so writes are quick.

How do I create a secondary index in Cassandra?

The CREATE INDEX statement is used to create a new (automatic) secondary index for a given (existing) column in a given table. A name for the index itself can be specified before the ON keyword, if desired. If data already exists for the column, it will be indexed asynchronously.


2 Answers

Alex's answer is correct but I thought I would add some additional input.

Cassandra secondary indexes (2i) are really meant for low-cardinality fields, i.e. things that are not unique to each entity / row.

If you have a table of 250 million US citizens, using a secondary index to track which state they're in is a perfect use case for 2i. Using a secondary index to track their social security number is not - it would create enormous performance problems for both reads and writes. You'd be better off creating your own index column family in the second scenario.

2i are not replicated and have to be created locally in each node, so there's a substantial amount of work involved in rebuilding them if you have to replace a node or add a new one.

Personally, I use 2i for filtering item results all sorted on the same CQL row (i.e. all items have the same partition key) - it's quite performant in that use case.

like image 130
Aaronontheweb Avatar answered Sep 18 '22 11:09

Aaronontheweb


No. Cassandra secondary indexes are based on a single column. You can find the CQL syntax for creating a secondary index here

As a bit of background, the reason secondary indexes are based on a single column is to avoid read-before-write in order to preserve the performance of writes.

like image 38
Alex Popescu Avatar answered Sep 16 '22 11:09

Alex Popescu