According to the CAP theory, Cassandra can only have eventually consistency. To make things worse, if we have multiple reads and writes during one request without proper handling, we may even lose the logical consistency. In other words, if we do things fast, we may do it wrong.
Meanwhile the best practice to design the data model for Cassandra is to think about the queries we are going to have, and then add a CF to it. In this way, to add/update one entity means to update many views/CFs in many cases. Without atomic transaction feature, it's hard to do it right. But with it, we lose the A and P parts again.
I don't see this concerns many people, hence I wonder why.
It does concern people, but presumably you are using cassandra because a single database server is unable to meet your needs due to scaling or reliability concerns. Because of this, you are forced to work around the limitations of a distributed system.
In real practice, do we always have ACID feature somewhere in the middle? I mean maybe implement in application layer or add a middleware to handle it?
No, you don't usually have acid somewhere else, as presumably that somewhere else must be distributed over multiple machines as well. Instead, you design your application around the limitations of a distributed system.
If you are updating multiple columns to satisfy queries, you can look at the eventually atomic section in this presentation for ideas on how to do that. Basically you write enough info about your update to cassandra before you do your write. That way if the write fails, you can retry it later.
If you can structure your application in such a way, using a co-ordination service like Zookeeper or cages may be useful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With