What is the canonical way to model many-to-many relations with CQL3 ? Let's say I have to tables
CREATE TABLE actor (
id text PRIMARY KEY,
given text,
surname text,
)
CREATE TABLE fan (
id text PRIMARY KEY,
given text,
surname text,
)
and I'd like to model the fact that an actor can have many fan and each fan can like many actors.
The first idea that came to my my was to use sets, like in the following (and the other way around for fans):
CREATE TABLE actor (
id text PRIMARY KEY,
given text,
surname text,
fans set<text>
)
<similarly for fan>
but it seems they are meant for small sets, and I don't see a way to check if a fan is related to an actor without loading either set completely.
The second choice I found would be to make two mapping tables, each for each relation direction:
CREATE TABLE actor_fan (
text actor,
text fan,
PRIMARY KEY(actor,fan)
);
<similarly for fan_actor>
Would this give me the ability to get both the fan list of an actor and check if a specific person is a fan of a given actor ? There is a lot of documentation about Cassandra, but it is often related to older versions and there seem to be lot of differences between the releases.
Handling One to One Relationship in CassandraOne to one relationship means two tables have one to one correspondence. For example, the student can register only one course, and I want to search on a student that in which course a particular student is registered in.
No joins. You cannot perform joins in Cassandra. If you have designed a data model and find that you need something like a join, you'll have to either do the work on the client side, or create a denormalized second table that represents the join results for you.
Apache Cassandra does not have the concept of foreign keys or relational integrity. Apache Cassandra's data model is based around designing efficient queries; queries that don't involve multiple tables. Relational databases normalize data to avoid duplication.
Tables are the entities of a database. Tables or column families are the entity of a keyspace. Row is an individual record in RDBMS.
The proper way to do this in Cassandra is denormalizing the data into 2 tables. You shouldn't worry about having to write twice, once on each table, as Cassandra is designed to handle writes very fast to support such model.
Take a look at this data modelling tutorials that will help understanding these things:
Data modelling tutorials
Also I see you mentioned sets as well. Just as a side note and although it is not an answer to your questions, you might want to be aware of some new features like: http://www.datastax.com/dev/blog/cql-in-2-1
The way to achieve it is denormalizing data creating an actors_by_fans
and a fans_by_actors
. You can also use sets but this have limitations you already mentioned.
HTH, Carlo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With