How to avoid secondary indexes in Cassandra?

I have heard repeatedly that secondary indexes in Cassandra are only a convenience and do not improve performance. The only case where they are recommended is a low-cardinality column (such as a gender column, which has two values, male or female).

Consider this example:

CREATE TABLE users ( 
userID uuid, 
firstname text, 
lastname text, 
state text, 
zip int, 
PRIMARY KEY (userID) 
);

Right now I cannot run this query unless I create a secondary index on the firstname column of users:

SELECT * FROM users WHERE firstname = 'john';

How do I denormalize this table so that I can run this query? Is using a composite key, as below, the only efficient way? Are there any other alternatives or suggestions?

CREATE TABLE users ( 
    userID uuid, 
    firstname text, 
    lastname text, 
    state text, 
    zip int, 
    PRIMARY KEY (firstname,userID) 
    );

asked Aug 04 '14 18:08

brain storm

1 Answer

To come up with a good data model, you first need to identify ALL the queries you would like to perform. If you only need to look up users by their firstname (or by firstname and userID), then your second design is fine.

If you also need to look up users by their last name, then you could create another table with the same fields but a primary key of (lastname, userID). Obviously you will need to update both tables at the same time. Data duplication is fine in Cassandra.
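
As a rough sketch of what that could look like (the table name users_by_lastname and all sample values are made up for illustration, and users here is the second design from the question, keyed by firstname and userID), a logged batch is one way to write both copies together:

CREATE TABLE users_by_lastname (
    lastname text,
    userID uuid,
    firstname text,
    state text,
    zip int,
    PRIMARY KEY (lastname, userID)
);

-- Write the same (hypothetical) user to both tables in one logged batch.
BEGIN BATCH
    INSERT INTO users (firstname, userID, lastname, state, zip)
    VALUES ('John', 123e4567-e89b-12d3-a456-426655440000, 'Smith', 'CA', 94105);
    INSERT INTO users_by_lastname (lastname, userID, firstname, state, zip)
    VALUES ('Smith', 123e4567-e89b-12d3-a456-426655440000, 'John', 'CA', 94105);
APPLY BATCH;

-- Each lookup is now a single-partition read on its own table.
SELECT * FROM users_by_lastname WHERE lastname = 'Smith';

A logged batch ensures that both inserts are eventually applied, but it does not isolate readers: for a short time a reader may see one table updated and not the other.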

Still, if you are concerned about the space needed for the two or more tables, you could create a single users table partitioned by user id, and additional tables for the fields you want to query by:

CREATE TABLE users ( 
    userID uuid, 
    firstname text, 
    lastname text, 
    state text, 
    zip int, 
    PRIMARY KEY (userID) 
);

CREATE TABLE users_by_firstname (
    firstname text,
    userid uuid,
    PRIMARY KEY (firstname, userid)
);

The disadvantage of this solution is that you will need two queries to retrieve users by their first name:

SELECT userid FROM users_by_firstname WHERE firstname = 'Joe';
SELECT * FROM users WHERE userid IN (...);
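
For illustration only, here is how the write and the two-step read might look with concrete values. The two UUIDs and the user details are invented; in practice the IDs in the second query come from the result of the first one.

-- Keep the lookup table in sync whenever a user is written.
BEGIN BATCH
    INSERT INTO users (userID, firstname, lastname, state, zip)
    VALUES (123e4567-e89b-12d3-a456-426655440000, 'Joe', 'Smith', 'CA', 94105);
    INSERT INTO users_by_firstname (firstname, userid)
    VALUES ('Joe', 123e4567-e89b-12d3-a456-426655440000);
APPLY BATCH;

-- Step 1: find the IDs of all users named Joe (single-partition read).
SELECT userid FROM users_by_firstname WHERE firstname = 'Joe';

-- Step 2: fetch the full rows for the IDs returned by step 1.
SELECT * FROM users WHERE userID IN (
    123e4567-e89b-12d3-a456-426655440000,
    123e4567-e89b-12d3-a456-426655440001
);

When step 1 returns many IDs, a single IN query makes one coordinator node gather rows from many partitions. Issuing one query per ID from the client, in parallel, is a common alternative.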

Hope this helps

answered Sep 19 '22 13:09

medvekoma

Tags: secondary-indexes, cassandra-2.0, cql3
