
How to do polling in Cassandra?

I'm trying to find a way to poll a Cassandra database, but I'm new to this and I don't know how.

Let's say I have a table "users" like this:

-> users
    -> user_name
    -> gender
    -> state

and I want to poll it constantly so that I know when a new user has been added to the table. How can I do that?

JerePfluger asked Nov 10 '22 15:11



1 Answer

The standard approach in a relational DB would be a SELECT ordered by some time-related ID, descending, so that the newest row is always returned first. You could then compare it with the last 'newest row' you saw and detect a change. In Cassandra, that won't work: without a WHERE clause on the partition key, the results are ordered by the partition's token, which is (almost certainly) a hash of the key and so effectively random with respect to insertion time.

The solution, then, is to create a table in which many users share one partition and are sorted by time within that partition. For example:

CREATE TABLE user_buckets (
    bucket text,
    user_timestamp timeuuid,
    user_username text,
    PRIMARY KEY(bucket, user_timestamp)
) WITH CLUSTERING ORDER BY (user_timestamp DESC);

In this case, you would write into both the users table and the user_buckets table, with 'bucket' being something reasonable, such as date(YYYY), where each partition contains all of the users registering in that year, or date(YYYYMMDD), where each partition contains all of the users registering on that day. You would then poll with SELECT ... FROM user_buckets WHERE bucket=(current-bucket) AND user_timestamp > (last timestamp you've seen).
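
As a rough sketch of that write and poll: this assumes day-sized buckets stored as 'YYYYMMDD' text and assumes that user_name is the primary key of users (the question doesn't say). The user, bucket, and timestamp values are made up for illustration.

```cql
-- Write path: record the new user in both tables.
-- Assumption: users has PRIMARY KEY (user_name). All values are sample data.
INSERT INTO users (user_name, gender, state)
VALUES ('alice', 'f', 'CA');

-- now() generates a timeuuid for the current time.
INSERT INTO user_buckets (bucket, user_timestamp, user_username)
VALUES ('20221115', now(), 'alice');

-- Poll path: rows in the current bucket that are newer than a point in time.
-- maxTimeuuid(t) is the largest possible timeuuid for timestamp t,
-- so "> maxTimeuuid(t)" matches only rows written strictly after t.
SELECT user_timestamp, user_username
FROM user_buckets
WHERE bucket = '20221115'
  AND user_timestamp > maxTimeuuid('2022-11-15 12:00:00+0000');
```

In an application you would more likely bind the newest user_timestamp returned by the previous poll in place of the maxTimeuuid(...) expression. Because of the DESC clustering order, that is the first row of each result. When the bucket rolls over (for example at midnight with daily buckets), it is probably worth polling the previous bucket once more so that late writes aren't missed.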

Jeff Jirsa answered Nov 15 '22 12:11
