If I delete every key in a ColumnFamily in a Cassandra db using remove(key), then call get_range_slices, the rows are still there but without columns. How can I remove entire rows?
Cassandra deletes data in each selected partition atomically and in isolation. Deleted data is not removed from disk immediately. Cassandra marks the deleted data with a tombstone, and the tombstone is removed by compaction once the grace period (gc_grace_seconds) has passed.
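To make the grace period concrete, here is a minimal sketch in plain Java. It does not use Cassandra at all: the class GracePeriodSketch, the Tombstone type and the compact method are made up for illustration, and real compaction does a lot more than this. The 864000 seconds used below is the usual default for gc_grace_seconds (10 days).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of tombstone purging; not Cassandra's actual code.
final class GracePeriodSketch {

    // A deletion marker for one row key, stamped with the time of the delete.
    static final class Tombstone {
        final String key;
        final long deletedAtSeconds;

        Tombstone(String key, long deletedAtSeconds) {
            this.key = key;
            this.deletedAtSeconds = deletedAtSeconds;
        }
    }

    // Keeps only the tombstones that are still inside the grace period.
    static List<Tombstone> compact(List<Tombstone> tombstones,
                                   long nowSeconds,
                                   long gcGraceSeconds) {
        List<Tombstone> kept = new ArrayList<>();
        for (Tombstone t : tombstones) {
            if (nowSeconds - t.deletedAtSeconds <= gcGraceSeconds) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Tombstone> tombstones = List.of(
                new Tombstone("old", 0L),          // deleted long ago
                new Tombstone("recent", 900_000L)  // deleted recently
        );
        // At time 1,000,000 with a 10-day grace period only "recent" survives.
        for (Tombstone t : compact(tombstones, 1_000_000L, 864_000L)) {
            System.out.println(t.key); // prints recent
        }
    }
}
```

The point of the sketch is only that a deleted row keeps a marker on disk for a while, which is why it does not vanish the moment remove is called.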
With CQL and the Java API, a row is deleted by executing a DELETE statement: String query = "DELETE FROM emp WHERE emp_id=3;"; session.execute(query);
I tested with Cassandra 0.6.3 and the problem is still the same. I don't think that bug fix is meant to get rid of the deleted row ids. See http://wiki.apache.org/cassandra/FAQ#range_ghosts for more information.
Why do deleted keys show up during range scans?
Because get_range_slice says, "apply this predicate to the range of rows given," meaning, if the predicate result is empty, we have to include an empty result for that row key. It is perfectly valid to perform such a query returning empty column lists for some or all keys, even if no deletions have been performed.
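Since the empty rows are a valid result, the practical workaround is to skip them on the client side. Below is a rough sketch of that idea in plain Java. It assumes the range scan result has already been copied into a Map from row key to the list of column names returned; RangeGhostFilter and liveKeys are hypothetical names, not part of any Cassandra client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of any Cassandra client API.
final class RangeGhostFilter {

    // Returns the keys whose column list is non-empty, preserving scan order.
    static List<String> liveKeys(Map<String, List<String>> rangeResult) {
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : rangeResult.entrySet()) {
            List<String> columns = row.getValue();
            if (columns != null && !columns.isEmpty()) {
                live.add(row.getKey());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // Stand-in for a range scan result: row key -> column names returned.
        Map<String, List<String>> scan = new LinkedHashMap<>();
        scan.put("user1", List.of("name", "email"));
        scan.put("user2", List.of());   // deleted row: key present, no columns
        scan.put("user3", List.of("name"));

        System.out.println(liveKeys(scan)); // prints [user1, user3]
    }
}
```

Note that this treats every row with no matching columns as deleted, which is also what you get for a live row that simply has nothing matching the predicate.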
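This is expected, because Cassandra uses distributed deletes: a delete has to reach every replica, and a replica may be unavailable when the delete is issued.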
Thus, a delete operation can't just wipe out all traces of the data being removed immediately: if we did, and a replica did not receive the delete operation, when it becomes available again it will treat the replicas that did receive the delete as having missed a write update, and repair them! So, instead of wiping out data on delete, Cassandra replaces it with a special value called a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
http://wiki.apache.org/cassandra/DistributedDeletes
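The argument above can be played through with a toy model. The sketch below is plain Java with two in-memory maps standing in for replicas and a simplified "newest timestamp wins" repair; TombstoneDemo, Cell and repair are illustrative names only, and Cassandra's real repair logic is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of two replicas; an assumption-laden sketch, not Cassandra code.
final class TombstoneDemo {

    // One stored cell: either a value or a deletion marker, with a write time.
    static final class Cell {
        final String value;
        final long timestamp;
        final boolean tombstone;

        Cell(String value, long timestamp, boolean tombstone) {
            this.value = value;
            this.timestamp = timestamp;
            this.tombstone = tombstone;
        }
    }

    // Simplified repair: for each key, both replicas end up with the newest cell.
    static void repair(Map<String, Cell> a, Map<String, Cell> b) {
        Map<String, Cell> merged = new HashMap<>(a);
        for (Map.Entry<String, Cell> e : b.entrySet()) {
            merged.merge(e.getKey(), e.getValue(),
                    (x, y) -> x.timestamp >= y.timestamp ? x : y);
        }
        a.clear();
        a.putAll(merged);
        b.clear();
        b.putAll(merged);
    }

    static boolean isLive(Map<String, Cell> replica, String key) {
        Cell c = replica.get(key);
        return c != null && !c.tombstone;
    }

    // Returns true if the deleted row is readable again after the repair.
    static boolean resurrectedAfterDelete(boolean useTombstone) {
        Map<String, Cell> replicaA = new HashMap<>();
        Map<String, Cell> replicaB = new HashMap<>();
        Cell original = new Cell("alice", 1L, false);
        replicaA.put("row1", original);
        replicaB.put("row1", original);

        // The delete at time 2 reaches replica A only; replica B missed it.
        if (useTombstone) {
            replicaA.put("row1", new Cell(null, 2L, true));
        } else {
            replicaA.remove("row1");
        }

        repair(replicaA, replicaB);
        return isLive(replicaA, "row1");
    }

    public static void main(String[] args) {
        System.out.println(resurrectedAfterDelete(false)); // prints true
        System.out.println(resurrectedAfterDelete(true));  // prints false
    }
}
```

With a hard delete, replica A looks like it missed a write and gets the row copied back from B. With a tombstone, the newer deletion marker wins and is propagated to B instead.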