
Are secondary indices always a bad idea in Cassandra even if I specify them in conjunction with the partitioning key in all my queries?

Tags:

cassandra

I know that secondary indices in Cassandra are generally a bad idea because the index is stored locally on each node, i.e. it is not distributed across the cluster, which may result in a query scanning a huge number of nodes. However, I don't understand why they are still a bad idea if I always specify the partition key in my queries and only use the secondary index as a final filter. I've read that they don't scale with large amounts of data even if I specify the partition key. Is this true, and if so, why?

Islam Hassan asked Nov 02 '25 07:11



1 Answer

In general, secondary indexes are a bad idea, not only because of the distribution issue, but also because of the index size and the number of distinct values: if the indexed field has very high or very low cardinality, you will spend time scanning many rows or many columns. You can also run into issues with tombstones.

To answer your question: secondary indexes in Cassandra don't scale that well, but if you specify the partition key, and thereby tell Cassandra which node holds the data, they perform much better. You can find more details in section F of this post:

https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive
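To make the difference concrete, here is a rough toy model in Python. It is not how Cassandra is implemented; it only assumes that each node keeps its own local index and that the partition key decides which node owns a row. Replication is ignored, and the names `ToyCluster`, `Node`, `query_index_only` and `query_with_partition_key` are made up for illustration.

```python
import zlib
from collections import defaultdict


class Node:
    """Toy node: stores rows per partition and keeps a node-local index on one column."""

    def __init__(self):
        self.rows = defaultdict(list)         # partition key -> rows
        self.local_index = defaultdict(list)  # indexed value -> (partition key, row)

    def insert(self, partition_key, row, indexed_column):
        self.rows[partition_key].append(row)
        self.local_index[row[indexed_column]].append((partition_key, row))


class ToyCluster:
    """Hypothetical cluster: one owner node per partition key, no replication."""

    def __init__(self, node_count, indexed_column):
        self.nodes = [Node() for _ in range(node_count)]
        self.indexed_column = indexed_column

    def _owner(self, partition_key):
        # Deterministic stand-in for a partitioner (assumption, not Murmur3).
        token = zlib.crc32(str(partition_key).encode())
        return self.nodes[token % len(self.nodes)]

    def insert(self, partition_key, row):
        self._owner(partition_key).insert(partition_key, row, self.indexed_column)

    def query_index_only(self, value):
        """No partition key given: every node's local index has to be consulted."""
        results, nodes_contacted = [], 0
        for node in self.nodes:
            nodes_contacted += 1
            results.extend(row for _, row in node.local_index.get(value, []))
        return results, nodes_contacted

    def query_with_partition_key(self, partition_key, value):
        """Partition key given: only the owning node's local index is consulted."""
        node = self._owner(partition_key)
        results = [row for pk, row in node.local_index.get(value, [])
                   if pk == partition_key]
        return results, 1


cluster = ToyCluster(node_count=8, indexed_column="status")
for user_id in range(100):
    status = "active" if user_id % 2 == 0 else "inactive"
    cluster.insert(user_id, {"user_id": user_id, "status": status})

rows_all, contacted_all = cluster.query_index_only("active")
rows_one, contacted_one = cluster.query_with_partition_key(42, "active")
print("index only:        nodes contacted =", contacted_all, "rows =", len(rows_all))
print("with partition key: nodes contacted =", contacted_one, "rows =", len(rows_one))
```

In this sketch the index-only query has to ask all 8 nodes, while the query that also names the partition key asks just one. The sketch says nothing about index size, cardinality or tombstones, which are separate concerns.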

I hope this helps!

Saifallah KETBI answered Nov 04 '25 04:11



