When I issue a Delete to HBase, I am aware that it does not remove the data immediately. But when is the data actually deleted, I mean physically?
When a Delete command is issued through the HBase client, no data is actually deleted. Instead a tombstone marker is set, making the deleted cells effectively invisible.
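To make the tombstone idea concrete, here is a minimal toy model in plain Python. It is not the HBase client API; the names `put`, `delete`, `get`, `cells` and `tombstones` are made up for illustration, and it assumes a marker hides every cell of the same row and column with a timestamp at or below the marker's.

```python
# Toy in-memory model of tombstone masking. NOT the HBase client API;
# all names here are hypothetical and exist only for illustration.
cells = []        # stored cells: (row, column, timestamp, value)
tombstones = []   # delete markers: (row, column, timestamp)

def put(row, col, ts, value):
    cells.append((row, col, ts, value))

def delete(row, col, ts):
    # A delete only records a marker; no stored cell is removed here.
    tombstones.append((row, col, ts))

def is_masked(row, col, ts):
    # Assumption: a marker hides cells with timestamp <= the marker's timestamp.
    return any(r == row and c == col and ts <= dts for (r, c, dts) in tombstones)

def get(row, col):
    """Return the newest value that is not masked, or None."""
    visible = [(ts, v) for (r, c, ts, v) in cells
               if r == row and c == col and not is_masked(r, c, ts)]
    return max(visible, key=lambda tv: tv[0])[1] if visible else None

put("row1", "cf:a", 100, "v1")
delete("row1", "cf:a", 150)
result = get("row1", "cf:a")
print(result)      # the cell is invisible to reads
print(len(cells))  # but it is still physically stored
```

The read path does the filtering, which is why the data looks deleted to a client even though nothing has been removed from storage yet.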
Using the deleteall command: use deleteall to remove a specified row from an HBase table. It takes the table name and row key as mandatory arguments, and optionally a column and a timestamp.
HBase Time to Live (TTL) option: you can set a TTL, in seconds, on a column family, and HBase will automatically expire rows (more precisely, the cells in that column family) once the expiration time is reached. This setting applies to all versions of a row in that column family, even the current one.
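The expiry rule can be sketched as a simple age check. This is only an illustration: `is_expired` is a hypothetical helper, not an HBase function, and it assumes cell timestamps in milliseconds, a TTL in seconds, and a strict "older than" comparison at the boundary.

```python
def is_expired(cell_ts_ms: int, ttl_seconds: int, now_ms: int) -> bool:
    """Hypothetical helper: True once the cell is older than the TTL."""
    return now_ms - cell_ts_ms > ttl_seconds * 1000

ONE_DAY = 86_400                 # TTL of one day, in seconds
written_at = 1_700_000_000_000   # example cell timestamp, in milliseconds

one_hour_later = is_expired(written_at, ONE_DAY, written_at + 3_600_000)
next_day_later = is_expired(written_at, ONE_DAY, written_at + 90_000_000)
print(one_hour_later, next_day_later)
```

The check depends only on the cell's timestamp, not on which version it is, which is why even the newest version of a row can expire.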
When you write something to HBase, it is first stored in the memstore (RAM) and later flushed to disk. The files written to disk are immutable; they are only rewritten by compactions.
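Deletes are taken care of during major compactions in HBase. These run periodically (the interval is set by hbase.hregion.majorcompaction; the default was 24 hours in older releases and is 7 days in newer ones) and can also be triggered via the API or the shell. Major compactions process delete markers; minor compactions don't.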
When you issue a normal delete, it results in a delete (tombstone) marker. These markers and the data they mask are removed during major compaction, so neither is present in the merged file afterwards.
Also, if you delete data and then put more data with a timestamp earlier than the tombstone's timestamp (and which otherwise matches the earlier delete), the new value is masked by the tombstone marker and gets will not return it. The put does not fail, it just has no visible effect. Puts with such timestamps only start working again after a major compaction has removed the marker.
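The masking and compaction behaviour described above can be sketched with a small simulation of a single row and column. `ToyStore` and its methods are hypothetical and are not part of HBase; the model assumes that a major compaction drops both the markers and every cell they mask, as described earlier in this answer.

```python
class ToyStore:
    """Toy model of one row/column. NOT HBase code; names are illustrative."""

    def __init__(self):
        self.cells = {}        # timestamp -> value
        self.tombstones = []   # timestamps of delete markers

    def put(self, ts, value):
        self.cells[ts] = value

    def delete(self, ts):
        # Only a marker is written; stored cells are left alone.
        self.tombstones.append(ts)

    def _masked(self, ts):
        return any(ts <= marker for marker in self.tombstones)

    def get(self):
        visible = [ts for ts in self.cells if not self._masked(ts)]
        return self.cells[max(visible)] if visible else None

    def major_compact(self):
        # Assumption: masked cells and the markers themselves are both dropped.
        self.cells = {ts: v for ts, v in self.cells.items()
                      if not self._masked(ts)}
        self.tombstones = []

store = ToyStore()
store.put(100, "old")
store.delete(200)           # tombstone at timestamp 200
store.put(150, "late")      # written after the delete, but with an earlier timestamp
before = store.get()        # masked by the tombstone

store.major_compact()       # marker and masked cells are gone
store.put(150, "retry")     # the same timestamp now works
after = store.get()
print(before, after)
```

In this model the value written before the compaction does not come back; it is discarded together with the marker, and only a put issued after the compaction becomes visible. Using always-increasing timestamps for new puts avoids the problem entirely.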
Hope it helps.