Cassandra - What is meant by - "cannot rename non primary key part"

Tags: cassandra, cql

I have created a table users as follows:

create table users (user_id text primary key, email text, first_name text, last_name text, session_token int);

I am referring to the CQL help documentation on the DataStax website.

I now want to rename the email column to "emails". But when I execute the command:

alter table users rename email to emails;

I am getting the error - Bad Request: cannot rename non primary key part email

I am using CQL 3. My CQLSH is 3.1.6 and C* is 1.2.8.

Why can't I rename the above column? If I run help alter table, it shows the option to rename a column. How do I rename the column?

asked Jan 16 '14 08:01 by Sumod


1 Answer

In CQL, you can rename the column used as the primary key, but not any others. This seems backwards: one would expect the primary key to be fixed and the other columns to be easy to change. The reason lies in implementation details.

The name of the primary key is not written into each row; it is stored separately, in a place that is easy to change. For non-primary-key columns, however, the column name is written into every row, so renaming such a column would require the system to rewrite every single row.

This article has some fantastic examples and a much longer discussion of Cassandra's internals.

To borrow an example directly from the article, consider this column family:

cqlsh:test> CREATE TABLE example (
            ... field1 int PRIMARY KEY,
            ... field2 int,
            ... field3 int);

Insert a little data:

cqlsh:test> INSERT INTO example (field1, field2, field3) VALUES (1, 2, 3);

And then the Cassandra-CLI output (not CQLSH) from querying this column family:

[default@test] list  example;
-------------------
RowKey: 1
  => (column=, value=, timestamp=1374546754299000)
  => (column=field2, value=00000002, timestamp=1374546754299000)
  => (column=field3, value=00000003, timestamp=1374546754299000)

The name of the primary key, "field1", is not stored in any of the rows, but "field2" and "field3" are written out in each one, so changing those names would require rewriting every row.
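
Applied to the users table from the question, the asymmetry looks roughly like this. It is only a sketch: the new name id is made up for illustration, and it assumes the CQL 3 RENAME behavior described above.

```cql
-- Accepted: user_id is the primary key, so only schema metadata changes.
ALTER TABLE users RENAME user_id TO id;

-- Rejected with "Bad Request: cannot rename non primary key part email",
-- because the name "email" is stored with the data in every row.
ALTER TABLE users RENAME email TO emails;
```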

So if you really still want to rename a non-primary-key column, there are basically two strategies, and neither of them is very desirable.

  1. Drop the column and add it back, as another poster mentioned. This has the big downside of discarding all the data in that column.

or

  2. Create a new column family that is basically a copy of the old one, but with the column in question renamed, and rewrite your data there. This is, of course, very computationally expensive.
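
For the users table from the question, the copy-to-a-new-table strategy could look roughly like the sketch below. The table name users_new and the file name users.csv are made up for illustration. The sketch also assumes that the cqlsh COPY command is available in your version, that the table is small enough to export to a single CSV file, and that nothing writes to users while the copy runs.

```cql
-- New table: same schema as users, but with the column named emails.
CREATE TABLE users_new (
    user_id text PRIMARY KEY,
    emails text,
    first_name text,
    last_name text,
    session_token int
);

-- Export the old table, listing the columns explicitly so their order is fixed.
COPY users (user_id, email, first_name, last_name, session_token) TO 'users.csv';

-- Import into the new table. With default COPY options the file has no
-- header row, so columns are matched by position, not by name.
COPY users_new (user_id, emails, first_name, last_name, session_token) FROM 'users.csv';
```

Compare the row counts and spot-check a few rows in users_new before dropping the old table. For a large table, a bulk export through CSV is unlikely to be practical, and the rewrite would have to be done by an application or batch job instead.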

answered Oct 13 '22 00:10 by framebit