
How Cassandra stores multicolumn primary key (CQL)

Tags: cassandra, cql

I have a small misunderstanding about composite row keys with CQL in Cassandra. Let's say I have the following:

cqlsh:testcql> CREATE TABLE Note (
           ... key int,
           ... user text,
           ... name text
           ... , PRIMARY KEY (key, user)
           ... );
cqlsh:testcql> INSERT INTO Note (key, user, name) VALUES (1, 'user1', 'name1');
cqlsh:testcql> INSERT INTO Note (key, user, name) VALUES (1, 'user2', 'name1');
cqlsh:testcql>
cqlsh:testcql> SELECT * FROM Note;

 key | user  | name
-----+-------+-------
   1 | user1 | name1
   1 | user2 | name1

How is this data stored? Are there two rows or one?

If there are two, how is it possible to have more than one row with the same key? If there is one, and I have records with key=1 and user ranging from "user1" to "user1000", does that mean there is a single row with key=1 and 1000 columns holding the name for each user?

Can someone explain what's going on in the background? Thanks.

Vladimir Prudnikov asked Jul 17 '13 16:07



1 Answer

So, after digging a bit more and reading an article suggested by Lyuben Todorov (thank you), I found the answer to my question.

Cassandra stores data in data structures called rows, which are quite different from rows in a relational database. Each row has a unique key.

Now, what's happening in my example... In table Note I have a composite key defined as PRIMARY KEY (key, user). Only the first element of this key acts as the row key; it's called the partition key. Internally, the rest of the key is used to build composite column names.

In my example

 key | user  | name
-----+-------+-------
   1 | user1 | name1
   1 | user2 | name1

This will be represented in Cassandra as a single row:

-------------------------------------
|   | user1:name    | user2:name    |
| 1 |--------------------------------
|   | name1         | name1         |
-------------------------------------
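
To make the mapping concrete, here is a minimal Python sketch of the grouping described above. It is only a toy model, not Cassandra's actual storage code: the function name `to_storage_rows` and the dict-based layout are assumptions made for illustration, and it ignores everything else the storage engine does.

```python
from collections import defaultdict


def to_storage_rows(cql_rows, partition_key, clustering_cols):
    """Group CQL rows into storage rows keyed by the partition key.

    Hypothetical helper: a toy model of the layout, not a Cassandra API.
    """
    storage = defaultdict(dict)
    for row in cql_rows:
        # The clustering values become the prefix of each column name.
        prefix = tuple(row[c] for c in clustering_cols)
        for col, value in row.items():
            if col == partition_key or col in clustering_cols:
                continue
            # Composite column name: clustering values + CQL column name.
            storage[row[partition_key]][prefix + (col,)] = value
    return dict(storage)


cql_rows = [
    {"key": 1, "user": "user1", "name": "name1"},
    {"key": 1, "user": "user2", "name": "name1"},
]

storage = to_storage_rows(cql_rows, partition_key="key", clustering_cols=["user"])
print(storage)
# {1: {('user1', 'name'): 'name1', ('user2', 'name'): 'name1'}}
```

Both CQL rows land under the single key 1, and the user value only shows up as part of the column name, which matches the diagram.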

Knowing that, it's clear that it's not a good idea to add a column with a huge (and growing) number of unique values to the composite key after the partition key, because all of those values will be stored in one row. It's even worse if you have multiple columns like this in a composite primary key.
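
As a rough illustration of that growth, the sketch below counts how many composite columns end up under each row key for the "user1" to "user1000" case from the question. The helper `count_storage_columns` is hypothetical and only models the one-column-per-user layout described in this answer.

```python
def count_storage_columns(keys_and_users):
    """Return {row key: number of composite columns} for a toy Note table.

    Hypothetical helper: assumes one 'name' column per (key, user) pair.
    """
    rows = {}
    for key, user in keys_and_users:
        # Each distinct user adds one composite column (user, "name").
        rows.setdefault(key, set()).add((user, "name"))
    return {key: len(cols) for key, cols in rows.items()}


# 1000 inserts that all share key=1 but have different users.
inserts = [(1, f"user{i}") for i in range(1, 1001)]

widths = count_storage_columns(inserts)
print(widths)  # {1: 1000}
```

Under this model there is still only one row, and it gets wider with every new user.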

Update: Later I found this blog post by Aaron Morton that explains the same thing in more detail.

Vladimir Prudnikov answered Oct 19 '22 12:10