
Transaction with Cassandra data model

According to the CAP theorem, Cassandra can only offer eventual consistency. To make things worse, if a single request performs multiple reads and writes without proper handling, we may lose logical consistency as well. In other words, if we do things fast, we may do them wrong.

Meanwhile, the best practice for designing a Cassandra data model is to start from the queries we are going to run and add a column family (CF) for each of them. As a result, adding or updating one entity often means updating many views/CFs. Without atomic transactions, it's hard to get this right. But with them, we lose the A and P parts of CAP again.

This doesn't seem to concern many people, and I wonder why.

  • Is it because we can always design our data model to avoid multiple reads and writes in one session?
  • Is it because we can just ignore the 'right' part?
  • In real practice, do we always have ACID features somewhere in the middle? I mean, maybe implemented in the application layer, or added as middleware to handle it?
aXqd asked Nov 04 '22 15:11



1 Answer

It does concern people, but presumably you are using Cassandra because a single database server cannot meet your needs for scaling or reliability. Because of this, you are forced to work around the limitations of a distributed system.

"In real practice, do we always have ACID features somewhere in the middle? I mean, maybe implemented in the application layer, or added as middleware to handle it?"

No, you don't usually have ACID somewhere else, because presumably that somewhere else would have to be distributed over multiple machines as well. Instead, you design your application around the limitations of a distributed system.

If you are updating multiple columns to satisfy queries, you can look at the "eventually atomic" section in this presentation for ideas on how to do that. Basically, you write enough information about your update to Cassandra before you perform the actual writes. That way, if one of the writes fails, you can retry it later.
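To make the idea concrete, here is a rough sketch of that "log the intent first, then write, then retry" flow. It is not taken from the presentation and does not use a real Cassandra driver: `FakeStore`, `update_views`, `retry_pending`, and the `pending_updates` table are all hypothetical names, and the store is just an in-memory dictionary standing in for a set of column families. It also assumes every write is idempotent, since a retry may apply the same write twice.

```python
import uuid


class FakeStore:
    """In-memory stand-in for a set of column families (hypothetical, not a real driver)."""

    def __init__(self):
        self.tables = {}      # table name -> {key: value}
        self.fail_on = set()  # tables whose next write should fail once

    def write(self, table, key, value):
        if table in self.fail_on:
            self.fail_on.discard(table)
            raise IOError(f"simulated write failure on {table}")
        self.tables.setdefault(table, {})[key] = value

    def delete(self, table, key):
        self.tables.get(table, {}).pop(key, None)


def update_views(store, writes):
    """Record the intended writes, apply them, then clear the record.

    `writes` is a list of (table, key, value) tuples. If any write raises,
    the record stays in `pending_updates` so it can be replayed later.
    """
    intent_id = str(uuid.uuid4())
    store.write("pending_updates", intent_id, list(writes))  # 1. log the intent
    for table, key, value in writes:                          # 2. apply each write
        store.write(table, key, value)
    store.delete("pending_updates", intent_id)                # 3. mark as complete
    return intent_id


def retry_pending(store):
    """Replay every logged update that never reached step 3."""
    pending = store.tables.get("pending_updates", {})
    for intent_id, writes in list(pending.items()):
        for table, key, value in writes:
            store.write(table, key, value)
        store.delete("pending_updates", intent_id)
```

In a real deployment, `retry_pending` would be run by a background job or by the next reader that notices a stale entry, and readers have to tolerate seeing some views updated before others in the meantime. That is the "eventually" part.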

If your application's structure allows it, a coordination service such as ZooKeeper or Cages may be useful.

sbridges answered Nov 15 '22 12:11
