
online bulk delete (truncate) of a cassandra keyspace

Tags:

cassandra

I read that once you drain a node you can delete the files and then restart. It works fine, but I have only tried it by draining all nodes, shutting down the whole cluster, deleting the files, and restarting.

What happens if I restart only one node at a time? As far as I understand, there is a risk that the restarted node will accept read requests and perform read repair using data from the other replicas.

Does anybody know the most failsafe procedure to truncate a keyspace while leaving the whole cluster up and running in order to serve other keyspaces?

mkm asked May 19 '11 12:05



People also ask

How do I TRUNCATE a Cassandra table?

To remove all data from a table without dropping the table: if necessary, use the cqlsh CONSISTENCY command to set the consistency level to ALL. Use nodetool status or some other tool to make sure all nodes are up and receiving connections. Then use TRUNCATE or TRUNCATE TABLE, followed by the table name.

Does TRUNCATE create tombstones in Cassandra?

TRUNCATE does not write tombstones at all. Instead, it immediately deletes the SSTables of the truncated table on all nodes.


1 Answer

$ bin/cassandra-cli -h localhost
[default@unknown] use keyspace1;
Authenticated to keyspace: Keyspace1
[default@Keyspace1] truncate standard1;     
standard1 truncated.

By design, this is not race-proof (that would require heavyweight locking); normally you would only truncate a CF that isn't serving live reads anyway. But if for some reason you must, disable read repair first ("update column family standard1 with read_repair_chance=0").
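To make the order of operations concrete for a whole keyspace, here is a minimal sketch that only builds the cassandra-cli statements as strings; it does not connect to a cluster. The function name `truncate_statements` and its parameters are hypothetical. The optional step that sets `read_repair_chance` back afterwards is an assumption of this sketch, not part of the advice above, and the value to restore is whatever your column family used before.

```python
from typing import List, Optional


def truncate_statements(
    keyspace: str,
    column_families: List[str],
    restore_read_repair_chance: Optional[float] = None,
) -> List[str]:
    """Build, in order, the cassandra-cli statements to truncate column families.

    Hypothetical helper: it only returns strings to paste into cassandra-cli.
    """
    statements = [f"use {keyspace};"]
    for cf in column_families:
        # Disable read repair first, so a read racing with the truncate
        # is less likely to trigger a repair that copies rows back.
        statements.append(
            f"update column family {cf} with read_repair_chance=0;"
        )
        statements.append(f"truncate {cf};")
        if restore_read_repair_chance is not None:
            # Assumption of this sketch: put the previous setting back
            # once the truncate has completed.
            statements.append(
                f"update column family {cf} "
                f"with read_repair_chance={restore_read_repair_chance};"
            )
    return statements
```

Calling `truncate_statements("Keyspace1", ["standard1"])` yields the `use`, `update column family` and `truncate` lines in that order. Passing a third argument appends a statement that sets `read_repair_chance` to that value after each truncate.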

jbellis answered Nov 15 '22 10:11
