Are secondary indices always a bad idea in Cassandra even if I specify them in conjunction with the partitioning key in all my queries?

Question

I know that secondary indices in Cassandra are generally a bad idea because the index is stored locally on each node, i.e. it is not distributed across the cluster, so a query may end up scanning a huge number of nodes. However, I don't understand why they are still a bad idea if I always specify the partition key in my queries and only use the secondary index as a final filter. I've read that they don't scale to large amounts of data even if I specify the partition key. Is this true, and if so, why?

Saifallah KETBI · Accepted Answer

In general, secondary indexes are a bad idea, not only because of the distribution problem but also because of the index size and the number of distinct values: if the indexed field has very high or very low cardinality, you will spend time scanning many rows or many columns. You can also run into other issues, for example when dealing with tombstones.

To answer your question: secondary indexes in Cassandra don't scale that well, but if you restrict the query by partition key, and thereby tell Cassandra which node holds the data, they perform much better. You can find more details in section F of this post:

https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive
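To make the difference concrete, here is a rough toy model in Python of a locally stored index. It is only a sketch under simplifying assumptions, not how Cassandra is implemented: the `Node` and `Cluster` classes are hypothetical, partition keys are plain integers placed by modulo instead of a token ring, and replication is ignored. It only shows that a query restricted by partition key consults one node's local index, while the same query without the key has to consult every node.

```python
from collections import defaultdict


class Node:
    """A toy node: stores its own rows plus a local index over one column."""

    def __init__(self):
        self.rows = {}                 # (partition_key, clustering_key) -> indexed value
        self.index = defaultdict(set)  # indexed value -> primary keys stored on this node

    def insert(self, pk, ck, value):
        self.rows[(pk, ck)] = value
        self.index[value].add((pk, ck))

    def lookup(self, value, pk=None):
        # Read the local index, then optionally narrow to a single partition.
        hits = self.index.get(value, set())
        if pk is not None:
            hits = {key for key in hits if key[0] == pk}
        return set(hits)


class Cluster:
    def __init__(self, node_count):
        self.nodes = [Node() for _ in range(node_count)]

    def owner(self, pk):
        # Toy placement by modulo; real Cassandra hashes the key onto a token ring.
        return self.nodes[pk % len(self.nodes)]

    def insert(self, pk, ck, value):
        self.owner(pk).insert(pk, ck, value)

    def query(self, value, pk=None):
        """Return (matching primary keys, number of nodes contacted)."""
        targets = [self.owner(pk)] if pk is not None else self.nodes
        results = set()
        for node in targets:
            results |= node.lookup(value, pk)
        return results, len(targets)


cluster = Cluster(node_count=4)
for pk in range(8):
    for ck in range(3):
        cluster.insert(pk, ck, "red" if ck == 0 else "blue")

with_key = cluster.query("red", pk=5)  # only the owning node is consulted
without_key = cluster.query("red")     # every node's local index is consulted
```

In this model `with_key` reports one contacted node and `without_key` reports all four, which is the scaling difference described above. The cardinality and tombstone costs are not modeled here.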

I hope this helps!

Tags:

cassandra

Islam Hassan
