
How to avoid Cassandra tombstones when inserting NULL values

My problem is that Cassandra creates tombstones when inserting NULL values.

From what I understand, Cassandra doesn't support NULLs, and when a NULL is inserted it simply deletes the respective column. On one hand this is very space-efficient, but on the other hand it creates tombstones, which degrade read performance.

This goes against the NoSQL philosophy, because Cassandra is saving space at the cost of read performance. In the NoSQL world space is cheap, but performance matters. I believe this is the philosophy behind storing tables in denormalized form.

I would like Cassandra to use the same technique for inserting NULL as for any other value: use timestamps and, during compaction, preserve the latest entry, even if that entry is NULL (or we can call it "unset"). Is there any tweak in the Cassandra config, or any other approach, that would let me do upserts with NULLs without creating tombstones?

I came across this issue, however it only allows NULL values to be ignored.

My use case: I have a stream of events, each identified by a causeID. I receive many events with the same causeID and I want to store only the latest event for each causeID (using an upsert). The properties of an event may change from NULL to a specific value, but also from a specific value to NULL. Unfortunately, the latter case generates tombstones and degrades read performance.

Update

It seems there is no way to avoid tombstones. Could you advise me on techniques to minimize them (for example, setting gc_grace_seconds to a very low value)? What are the risks, and what should I do when a node is down for longer than gc_grace_seconds?

Tomas Bartalos asked Dec 27 '18 10:12


1 Answer

You can't insert NULL into Cassandra: it has a special meaning there and leads to the creation of the tombstones that you observe. If you want to treat NULL as a special value, why not solve this problem on the application side? When you get a null status, just insert a special value that can't otherwise occur in your table, and when you read the data back, check for that special value and return null to the requester.
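A minimal sketch of that application-side mapping, in Python, might look like the following. The sentinel string `__NULL__`, the helper names `encode_row` and `decode_row`, and the example column names are all hypothetical, and the sketch assumes text columns only; a numeric column would need its own out-of-range sentinel. No driver calls are shown: the encoded dict is what you would bind to your INSERT, and the decoded dict is what you would hand back to the caller.

```python
from typing import Dict, Optional

# Hypothetical sentinel: pick a value that can never occur as real data.
NULL_SENTINEL = "__NULL__"


def encode_row(event: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Replace None with the sentinel before writing.

    Because every column now carries a real value, the write is an
    ordinary overwrite rather than a column delete.
    """
    return {k: NULL_SENTINEL if v is None else v for k, v in event.items()}


def decode_row(row: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map the sentinel back to None after reading."""
    return {k: None if v == NULL_SENTINEL else v for k, v in row.items()}


# Example: a property that went from a specific value back to "null".
event = {"cause_id": "c-1", "status": None}
stored = encode_row(event)      # bind these values to the INSERT
restored = decode_row(stored)   # apply this to rows read back
```

The trade-off is that every reader of the table has to know about the sentinel, so it is worth keeping the encode/decode pair in one shared place.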

Alex Ott answered Oct 14 '22 07:10