I recently started using Cassandra - I come from a traditional relational database background, so it's definitely a bit different. One thing I'm used to always doing is generating a unique ID for each row (OID, etc.). So, for my tables that I've been creating in Cassandra I've been putting a UUID column on each of them and generating a UUID. My question is...is this really "necessary"? I'm not using the UUID as part of my partition key, so I'm not really using it for anything at the moment, but it's a tough habit to break. Some advice would be great!
Correct, it's not necessary. But having a UUID column in a table can be useful in certain cases. For example, imagine you have a table like this:
CREATE TABLE user (
    id uuid,
    name text,
    login text,
    day_of_birth date,
    PRIMARY KEY (login)
);
This table allows you to query users by login.
Now imagine you also want to query users by name.
Of course, if this kind of query will only be run occasionally, you can create a SECONDARY INDEX.
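As a minimal sketch of that option, an index on the name column might look like the following. The index name `user_name_idx` and the sample value are made up for illustration.

```cql
-- Hypothetical index name; indexes the non-key column "name"
CREATE INDEX user_name_idx ON user (name);

-- With the index in place, this query no longer needs the partition key
SELECT * FROM user WHERE name = 'Alice Smith';
```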
But if you want good read performance, you can denormalize your data with a table structure like this:
CREATE TABLE user (
    id uuid,
    name text,
    login text,
    day_of_birth date,
    PRIMARY KEY (id)
);
CREATE TABLE user_by_name (
    id uuid,
    name text,
    PRIMARY KEY (name)
);
CREATE TABLE user_by_login (
    id uuid,
    login text,
    PRIMARY KEY (login)
);
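Here is a rough sketch of how these three tables might be used together. The UUID and user values are invented for illustration. It also assumes names are unique, since name is the whole primary key of user_by_name.

```cql
-- Write the same user to all three tables; a logged batch keeps them in step
BEGIN BATCH
    INSERT INTO user (id, name, login, day_of_birth)
    VALUES (123e4567-e89b-12d3-a456-426614174000, 'Alice Smith', 'asmith', '1990-05-17');
    INSERT INTO user_by_name (name, id)
    VALUES ('Alice Smith', 123e4567-e89b-12d3-a456-426614174000);
    INSERT INTO user_by_login (login, id)
    VALUES ('asmith', 123e4567-e89b-12d3-a456-426614174000);
APPLY BATCH;

-- Lookup by name takes two reads: first resolve the id ...
SELECT id FROM user_by_name WHERE name = 'Alice Smith';

-- ... then fetch the full row by its UUID
SELECT * FROM user WHERE id = 123e4567-e89b-12d3-a456-426614174000;
```

This is where the UUID earns its place: it is the stable key that ties the lookup tables back to the main row.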
But with this structure, you have to insert and update in all three tables yourself to keep the data consistent. Instead of creating the two extra tables, you can use a MATERIALIZED VIEW, so that you maintain only one table and let Cassandra maintain the views:
CREATE TABLE user (
    id uuid,
    name text,
    login text,
    day_of_birth date,
    PRIMARY KEY (id)
);
CREATE MATERIALIZED VIEW user_by_name
AS
SELECT *
FROM user
WHERE id IS NOT NULL
AND name IS NOT NULL
PRIMARY KEY ((name), id);
CREATE MATERIALIZED VIEW user_by_login
AS
SELECT *
FROM user
WHERE id IS NOT NULL
AND login IS NOT NULL
PRIMARY KEY ((login), id);
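With the views defined, a sketch of typical usage could look like this (sample values are invented). You write only to the base table and read from whichever view matches the query. Note that recent Cassandra versions mark materialized views as experimental, so they may need to be enabled in the server configuration first.

```cql
-- Single write to the base table; uuid() generates a random UUID server-side
INSERT INTO user (id, name, login, day_of_birth)
VALUES (uuid(), 'Alice Smith', 'asmith', '1990-05-17');

-- Cassandra propagates the row to both views
SELECT * FROM user_by_name WHERE name = 'Alice Smith';
SELECT * FROM user_by_login WHERE login = 'asmith';
```

Because id is part of each view's primary key, two users with the same name end up as separate rows in user_by_name instead of overwriting each other.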