In Cassandra, I'm using the cql:
select msg from log where id in ('A', 'B') and filter1 = 'filter'
(where id
is the partition key and filter1
is a secondary index and filter1
cannot be used as a cluster column)
This gives the response:
Select on indexed columns and with IN clause for the PRIMARY KEY are not supported
How can I change CQL to prevent this?
Using the SELECT command with the IN keyword. The IN keyword can define a set of clustering columns to fetch together, supporting a "multi-get" of CQL rows. A single clustering column can be defined if all preceding columns are defined for either equality or group inclusion.
4. Secondary Indexes. Secondary Indexes in Cassandra solve the need for querying columns that are not part of the primary key. When we insert data, Cassandra uses an append-only file called commitlog for storing the changes, so writes are quick.
You can't create a table in Cassandra without a primary key, But still if you want to save your data you can add an additional column to your table (let say "pk") with data type UUID.
CQL does not support wildcard queries. CQL does not support Union, Intersection queries. Table columns cannot be filtered without creating the index. Greater than (>) and less than (<) query is only supported on clustering column.
You would need to split that up into separate queries of:
select msg from log where id = 'A' and filter1 = 'filter';
and
select msg from log where id = 'B' and filter1 = 'filter';
Due to the way data is partitioned in Cassandra, CQL has a lot of seemingly arbitrary restrictions (to discourage inefficient queries and also because they are complex to implement).
Over time I think these restrictions will slowly be removed, but for now we have to work around them. For more details on the restrictions, see A deep look at the CQL where clause.
Another option, is that you could build a table specifically for this query (a query table) with filter1
as a partition key and id
as a clustering key. That way, your query works and you avoid having a secondary index all-together.
aploetz@cqlsh:stackoverflow> CREATE TABLE log
(filter1 text,
id text,
msg text,
PRIMARY KEY (filter1, id));
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg)
VALUES ('filter','A','message A');
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg)
VALUES ('filter','B','message B');
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg)
VALUES ('filter','C','message C');
aploetz@cqlsh:stackoverflow> SELECT msg FROM log
WHERE filter1='filter' AND id IN ('A','B');
msg
-----------
message A
message B
(2 rows)
You would still be using an "IN" which isn't known to perform well either. But you would also be specifying a partition key, so it might perform better than expected.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With