For secondary index queries where the partition key is specified in the WHERE clause, does the secondary index lookup hit all cluster nodes, or just the node that owns the specified partition key?
If the latter is correct, then a secondary index would also be a good fit for high-cardinality fields (but only for queries that restrict the partition key).
EDIT: For example, with the following feed schema, a query on a specific feed (feed_id specified) that retrieves existing or deleted feed items should be very efficient:
CREATE TABLE my_feed (
    feed_id int,
    item_id timeuuid,
    is_deleted boolean,
    data text,
    PRIMARY KEY (feed_id, item_id)
) WITH CLUSTERING ORDER BY (item_id DESC);
CREATE INDEX my_feed_is_deleted_idx ON my_feed (is_deleted);
==> SELECT * FROM my_feed WHERE feed_id=1 AND is_deleted=false; --efficient?
If the query restricts the partition key, it is not a cluster-wide operation: only the node(s) holding the target partition are contacted. If you have wide partitions with many rows each, a secondary index is an efficient way to filter those rows down once the partition has been located.
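To make the routing argument concrete, here is a minimal toy model in Python. It is not Cassandra code and does not use any Cassandra driver: the `Node` and `Cluster` classes, the modulo-based placement, and the integer `item_id` values are all assumptions made for illustration. The only behaviour it tries to mirror is that each node indexes just the rows it stores, so a query that names `feed_id` only needs to ask the owner of that partition.

```python
from collections import defaultdict


class Node:
    """Hypothetical node: stores some partitions and a local index on is_deleted."""

    def __init__(self, name):
        self.name = name
        # Local index: is_deleted value -> list of (feed_id, item_id) stored on this node
        self.index = defaultdict(list)

    def insert(self, feed_id, item_id, is_deleted):
        self.index[is_deleted].append((feed_id, item_id))

    def index_lookup(self, is_deleted, feed_id=None):
        hits = self.index[is_deleted]
        if feed_id is not None:
            # Partition key known: keep only index entries from that partition
            hits = [h for h in hits if h[0] == feed_id]
        return list(hits)


class Cluster:
    """Hypothetical cluster with no replication and modulo placement (not a real token ring)."""

    def __init__(self, node_count):
        self.nodes = [Node(f"node{i}") for i in range(node_count)]

    def owner(self, feed_id):
        return self.nodes[feed_id % len(self.nodes)]

    def insert(self, feed_id, item_id, is_deleted):
        self.owner(feed_id).insert(feed_id, item_id, is_deleted)

    def query(self, is_deleted, feed_id=None):
        # With a partition key, only the owner is asked; without one, every node is.
        targets = [self.owner(feed_id)] if feed_id is not None else self.nodes
        rows = []
        for node in targets:
            rows.extend(node.index_lookup(is_deleted, feed_id))
        return rows, [node.name for node in targets]


def build_demo_cluster():
    cluster = Cluster(4)
    for feed_id in range(8):
        for item_id in range(3):
            cluster.insert(feed_id, item_id, is_deleted=(item_id % 2 == 1))
    return cluster


if __name__ == "__main__":
    demo = build_demo_cluster()
    print(demo.query(False, feed_id=1))  # index lookup restricted to one partition
    print(demo.query(False)[1])          # index lookup with no partition key
```

In this sketch the first query contacts a single node and the second contacts all four. On a real cluster you could check the same thing for the `my_feed` query by enabling `TRACING ON` in cqlsh and looking at which nodes appear in the trace.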