I want to verify that rows are being added to the table. What CQL statement would show the last n rows from the table below?
Table description below:
cqlsh:timeseries> describe table option_data;
CREATE TABLE option_data (
  ts bigint,
  id text,
  strike decimal,
  callask decimal,
  callbid decimal,
  maturity timestamp,
  putask decimal,
  putbid decimal,
  PRIMARY KEY ((ts), id, strike)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
cqlsh:timeseries>
Use the DISTINCT keyword to return only distinct (different) values of partition keys. The FROM clause specifies the table to query. You may want to precede the table name with the name of the keyspace followed by a period (.). If you do not specify a keyspace, Cassandra queries the current keyspace.
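As a rough illustration of the DISTINCT and keyspace-qualified FROM forms described above, applied to the table from the question (the keyspace name timeseries is taken from the cqlsh prompt; the LIMIT value is arbitrary):

```cql
-- List partition key values that exist in the table.
-- DISTINCT only works on the partition key columns (here: ts).
-- The values come back in token order, not in time order.
SELECT DISTINCT ts FROM timeseries.option_data LIMIT 20;
```

This can show that partitions exist, but it does not by itself tell you which ones were written most recently.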
The range query algorithm is implemented in Apache Cassandra to deal with range and secondary index queries. As shown through this post, the algorithm determines an attribute called concurrency factor with the goal to limit the number of nodes queried to retrieve the results.
default_time_to_live (default: 0): Set this property in MapReduce scenarios when you have no control of TTL. The value of this property is a number of seconds. If it is set, Cassandra applies a default TTL marker to each column in the table, set to this value.
You didn't specify last n "by what".
To get the last N rows by id within a single ts partition:
SELECT * FROM option_data WHERE ts=1 ORDER BY id DESC LIMIT N;
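ORDER BY can only be applied to clustering columns, in the order they are declared (here id, then strike), and only when the partition key (ts) is restricted in the WHERE clause. Because ts is the partition key, rows are not ordered by time across partitions. If you need to query by time you will need to think about your data model a little more.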
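A minimal sketch of what this looks like against option_data. The ts value 1 and the limit of 10 are placeholders, not values from the question:

```cql
-- Last 10 rows by id inside one known partition.
-- The partition key must be restricted for ORDER BY to be accepted.
SELECT ts, id, strike, callbid, callask
FROM option_data
WHERE ts = 1
ORDER BY id DESC
LIMIT 10;

-- Quick check that the table holds any rows at all.
-- Without a WHERE clause the rows come back in token order,
-- so these are NOT necessarily the most recently written rows.
SELECT ts, id, strike FROM option_data LIMIT 10;
```

If you already know a ts value that your writer has just used, the first query is the more reliable way to confirm that the insert landed.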
If your queries are most often "last N", you might consider writing something like this:
CREATE TABLE time_series (
  id text,
  t timeuuid,
  data text,
  PRIMARY KEY (id, t)
) WITH CLUSTERING ORDER BY (t DESC);
... where 'id' is your time series id. The CLUSTERING ORDER reverses the order of timeuuid 't', causing the cells to be stored in a natural order for your query.
With this, you would get the last five events as follows:
SELECT * FROM time_series WHERE id='stream id' LIMIT 5;
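To round this out, here is a sketch of writing to that table and reading back recent events after a given instant. The payload text and the date are made-up placeholders; now() and minTimeuuid() are built-in CQL timeuuid functions:

```cql
-- Append an event; now() generates a timeuuid for the current time.
INSERT INTO time_series (id, t, data)
VALUES ('stream id', now(), 'example payload');

-- Newest five events at or after a given instant.
-- Rows come back newest first because of CLUSTERING ORDER BY (t DESC).
SELECT * FROM time_series
WHERE id = 'stream id'
  AND t >= minTimeuuid('2014-01-01 00:00+0000')
LIMIT 5;
```

Running the SELECT before and after an insert is a simple way to confirm that new rows are arriving for a given stream.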
There is a lot of information out there for time series in Cassandra. I suggest reading some of the more recent articles on the matter.