Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Modeling many-to-many relations in Cassandra 2 with CQL3

What is the canonical way to model many-to-many relations with CQL3 ? Let's say I have to tables

CREATE TABLE actor (
    id text PRIMARY KEY,
    given text,
    surname text,
)

CREATE TABLE fan (
    id text PRIMARY KEY,
    given text,
    surname text,
)

and I'd like to model the fact that an actor can have many fan and each fan can like many actors.

The first idea that came to my my was to use sets, like in the following (and the other way around for fans):

CREATE TABLE actor (
    id text PRIMARY KEY,
    given text,
    surname text,
    fans set<text>
)

<similarly for fan>

but it seems they are meant for small sets, and I don't see a way to check if a fan is related to an actor without loading either set completely.

The second choice I found would be to make two mapping tables, each for each relation direction:

CREATE TABLE actor_fan (
    text actor,
    text fan,
    PRIMARY KEY(actor,fan)
);

<similarly for fan_actor>

Would this give me the ability to get both the fan list of an actor and check if a specific person is a fan of a given actor ? There is a lot of documentation about Cassandra, but it is often related to older versions and there seem to be lot of differences between the releases.

like image 831
Marco Righele Avatar asked Oct 26 '14 13:10

Marco Righele


People also ask

How the relationships are handled in Cassandra?

Handling One to One Relationship in CassandraOne to one relationship means two tables have one to one correspondence. For example, the student can register only one course, and I want to search on a student that in which course a particular student is registered in.

Can we join two tables in Cassandra?

No joins. You cannot perform joins in Cassandra. If you have designed a data model and find that you need something like a join, you'll have to either do the work on the client side, or create a denormalized second table that represents the join results for you.

Does Cassandra support foreign keys?

Apache Cassandra does not have the concept of foreign keys or relational integrity. Apache Cassandra's data model is based around designing efficient queries; queries that don't involve multiple tables. Relational databases normalize data to avoid duplication.

What are two logical entities in the Cassandra data model?

Tables are the entities of a database. Tables or column families are the entity of a keyspace. Row is an individual record in RDBMS.


2 Answers

The proper way to do this in Cassandra is denormalizing the data into 2 tables. You shouldn't worry about having to write twice, once on each table, as Cassandra is designed to handle writes very fast to support such model.

Take a look at this data modelling tutorials that will help understanding these things:

Data modelling tutorials

Also I see you mentioned sets as well. Just as a side note and although it is not an answer to your questions, you might want to be aware of some new features like: http://www.datastax.com/dev/blog/cql-in-2-1

like image 154
Bereng Avatar answered Nov 16 '22 03:11

Bereng


The way to achieve it is denormalizing data creating an actors_by_fans and a fans_by_actors. You can also use sets but this have limitations you already mentioned.

HTH, Carlo

like image 41
Carlo Bertuccini Avatar answered Nov 16 '22 01:11

Carlo Bertuccini