From the table schema below, I am trying to select all pH readings that are below 5.
I have followed these three pieces of advice:
Here is my query:
select * from todmorden_numeric where sensor_name = 'pHradio' and reading_value < 5 allow filtering;
Which is rejected with this message:
Bad Request: No indexed columns present in by-columns clause with Equal operator
I tried adding a secondary index to the sensor_name column and was told that it was already part of the key and therefore already indexed.
I created the index after the table had been in use for a while - could that be the problem? I ran "nodetool refresh" in the hope it would make the index available but this did not work. Here is the output of describe table todmorden_numeric
:
CREATE TABLE todmorden_numeric (
sensor_name text,
reading_time timestamp,
reading_value float,
PRIMARY KEY ((sensor_name), reading_time)
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='Data that suits being stored as floats' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
index_interval=128 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
default_time_to_live=0 AND
speculative_retry='99.0PERCENTILE' AND
memtable_flush_period_in_ms=0 AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'LZ4Compressor'};
CREATE INDEX todmorden_numeric_reading_value_idx ON todmorden_numeric (reading_value);
Cassandra allows range search only on:
a) Partition Key only if ByteOrderPartitioner is used (default now is murmur3).
b) any single clustering key ONLY IF any clustering keys defined BEFORE the target column in the primary key definition are already specified by an = operator in the predicate.
They don't work on secondary indices.
Consider the following table definition:
CREATE TABLE tod1 (name text, time timestamp,
val float, PRIMARY KEY (name, time));
You CAN'T do a range on the val in this case.
Consider this one:
CREATE TABLE tod2 (name text, time timestamp,
val float, PRIMARY KEY (name, time, val));
Then the following is valid:
SELECT * FROM tod2 WHERE name='X' AND time='timehere' AND val < 5;
Kinda pointless, but this is not valid:
SELECT * from tod2 WHERE name='X' AND val < 5;
It's not valid as you haven't filtered by a previous clustering key in the primary key def (in this case, time).
For your query, you may want to do this:
CREATE TABLE tod3 (name text, time timestamp,
val float, PRIMARY KEY (name, val, time));
Note the order of columns in the primary key: val's before time.
This will allow you to do:
SELECT * from tod3 WHERE name='asd' AND val < 5;
On a different note, how long do you intend to hold data? How frequently do you get readings? This can cause your partition to grow quite large quite quickly. You may want to bucket it readings into multiple partitions (manual sharding). Perhaps one partition per day? Of course, such things would greatly depend on your access patterns.
Hope that helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With