Suppose we have such table:
create table users (
id text,
roles set<text>,
PRIMARY KEY ((id))
);
I want all the values of this table to be stored on the same Cassandra node (OK, not really the same, same 3, but have all the data mirrored, but you got the point), so to achieve that i want to change this table to be like this:
create table users_v2 (
partition int,
id text,
roles set<text>,
PRIMARY KEY ((partition), id)
);
How can i do that without losing the data from the first table? It seems to be impossible to ALTER TABLE in order to add such column. i'm OK with that. What i try to do is to copy data from the first table and insert to the second table. When i do it as it is, the partition column іs missing, which is expected. I can ALTER the first table and add a 'partition' column to the end, and then COPY in correct order, but i can't update all the rows in the first table to set the all some partition, and it seems to be no "default" value when column is added.
You simply cannot alter the primary key of a Cassandra table. You need to create another table with your new schema and perform a data migration. I would suggest that you use Spark for that since it is really easy to do a migration between two tables with only a few lines of code.
This also answer to the alter primary key question.
If you have not a lot of data in table there is another way. In utility "DataStax Dev Center", select table and use command "Export All result to file as INSERT". It will save all data from table to file with Insert CQL-instructions.
Then you should drop table, create new one with new PARTITION KEY and finally fill it by instructions from file via CQL.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With