Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bad Request: No indexed columns present in by-columns clause with Equal operator : CQL error?

I have below table in CQL-

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id)
   );

I inserted couple of records in the above table like this which I will be inserting in our actual use case scenario-

insert into test (employee_id, employee_name, value, last_modified_date) values ('1', 'e27',  'some_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('2', 'e27',  'some_new_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('3', 'e27',  'some_again_value', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('4', 'e28',  'some_values', now());
insert into test (employee_id, employee_name, value, last_modified_date) values ('5', 'e28',  'some_new_values', now());

Now I was doing select query for - give me all the employee_id for employee_name e27.

select employee_id from test where employee_name = 'e27';

And this is the error I am getting -

Bad Request: No indexed columns present in by-columns clause with Equal operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

Is there anything wrong I am doing here?

My use cases are in general -

  1. Give me everything for any of the employee_name?
  2. Give me everything for what has changed in last 5 minutes?
  3. Give me the latest employee_id and value for any of the employee_name?
  4. Give me all the employee_id for any of the employee_name?

I am running Cassandra 1.2.11

like image 522
AKIWEB Avatar asked Nov 03 '13 18:11

AKIWEB


2 Answers

The general rule is simple: "you can only query by columns that are part of the key". As an explanation all other queries would require a complete scan of the tables which might mean a lot of data sifting.

There are things that can modify this rule:

  1. use secondary indexes for columns with low cardinality (more details here)
  2. define multi-column keys (e.g. PRIMARY KEY (col1, col2); which would allow queries like col1 = value1 and col1 = value1 and col2 COND)
  3. use ALLOW FILTERING in queries. This will result in a warning as Cassandra will have to sift through a lot of data and there will be no performance guarantees. For more details see details of ALLOW FILTERING in CQL and this SO thread
like image 160
Alex Popescu Avatar answered Oct 24 '22 05:10

Alex Popescu


Cassandra take a little getting used to :) Some of us have been spoiled by some of the extra stuff RDBMS does for you that you do not get for free from noSql.

If you think back on a regular RDBMS table, if you SELECT on a column that has no index, the DB must do a full-table scan to find all the matches you seek. This is a no-no in Cassandra, and it will complain if you try to do this. Imagine if you found 10^32 matches to this query? It is not a reasonable ask.

In your table, you have coded *PRIMARY KEY(employee_id);* this is the row's primary and unique identifying key. You can now SELECT * from TEST where employee_id='123'; this is perfectly reasonable and Cassandra will happily return the result.

However, your SELECT from TEST WHERE employee_name = 'e27'; tells Cassandra to go and read EVERY record until it finds a match on 'e27'. With no index to rely on, it politely asks you to 'forget it'.

If you want to filter on a column, make sure you have an index on that column so that Cassandra can performs the filtering you need.

like image 16
hughj Avatar answered Oct 24 '22 04:10

hughj