Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is CRDT in Distributed Systems?

Tags:

I am a newbie in Distributed systems and I am trying to get an insight on the concept of CRDT. I realize that it has three notations :

Conflict-free Replicated Data Type Convergent Replicated Data Type Commutative Replicated Data Type 

Can anyone give an example where we use CRDT in distributed systems? Thanks a lot in advance.

like image 455
fnaticRC ggwp Avatar asked Dec 10 '15 01:12

fnaticRC ggwp


People also ask

How CRDT work?

With state-based CRDTs, the client doesn't pass on the operation(s) it applied to change the data. Instead, it sends the new state of the data (as a CRDT) to all other clients. Clients can merge with their own changes because the CRDTs conform to a consistent policy to resolve conflicts and eventually converge.

When to use CRDT?

Select the right CRDT that fits your use case. The counter is the simplest of the CRDTs. It can be applied for use cases such as global voting, tracking active sessions, metering, etc. However, if you want to merge the state of distributed objects, then you must consider other data structures, too.

What is CRDT algorithm?

Conflict-free Replicated Data Types (CRDTs) are an increasingly popular family of algorithms for optimistic replication. They allow data to be concurrently updated on several replicas, even while those replicas are offline, and provide a robust way of merging those updates back into a consistent state.

Why CRDT?

CRDT is capable of working peer-to-peer with end-to-end encryption; if a server is used at all it only needs to coordinate connections between clients. It is resilient to transient network connections. It even works if clients go offline for a period of time, make changes, and synchronise when the network returns.


2 Answers

CRDTs are inspired by the work of Marc Shapiro. In distributed computing, a conflict-free replicated data type (abbreviated CRDT) is a type of specially-designed data structure used to achieve strong eventual consistency (SEC) and monotonicity (absence of rollbacks). There are two alternative routes to ensuring SEC: operation-based CRDTs and state-based CRDTs.

CRDTs on different replicas can diverge from one another but at the end they can be safely merged providing an eventually consistent value. In other words, CRDTs have a merge method that is idempotent, commutative and associative.

The two alternatives are equivalent, as one can emulate the other, but operation-based CRDTs require additional guarantees from the communication middleware. CRDTs are used to replicate data across multiple computers in a network, executing updates without the need for remote synchronization. This would lead to merge conflicts in systems using conventional eventual consistency technology, but CRDTs are designed such that conflicts are mathematically impossible. Under the constraints of the CAP theorem they provide the strongest consistency guarantees for available/partition-tolerant (AP) settings.

Some examples where they are used

Riak is the most popular open source library of CRDT's and is used by Bet365 and League of Legends. Below are some useful links that supports Riak.

1- Bet365 (Uses Erlang and Riak) http://www.erlang-factory.com/static/upload/media/1434558446558020erlanguserconference2015bet365michaelowen.pdf

2- League of Legends uses the Riak CRDT implementation for its in-game chat system (which handles 7.5 million concurrent users and 11,000 messages per second)

3- Roshi implemented by SoundCloud that supports a LWW time-stamped Set: -Blog post: https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events

like image 167
Metin Dagcilar Avatar answered Sep 21 '22 13:09

Metin Dagcilar


CRDTs use Math to enforce consistency across a distributed cluster, without having to worry about consensus and associated latency/unavailability.

The set of values that a CRDT can take at anytime come under the category of a semi-lattice (specifically a join semi-lattice), which is a POSET (partially-ordered set) with a least upper bound function (LUB).

In simple terms, a POSET is a collection of items in which not all are comparable. E.g. in an array of pairs: {(2,4), (4, 5), (2, 1), (6, 3)}, (2,4) is < (4,5), but can't be compared with (6,3) (since one element is larger and the other smaller). Now, a semi-lattice is a POSET in which given 2 pairs, even if you can't compare the two, you can find a element greater than both (LUB).

Another condition is that updates to this datatype need to be increasing, CRDTs have monotonically increasing state, where clients never observe state rollback.

This excellent article uses the array I used above as an example. For a CRDT maintaining those values, if 2 replicas are trying to achieve consensus between (4,5) and (6,3), they can pick a LUB = (6,5) as consensus and assign both replicas to it. Since the values are increasing, this is a good value to settle on.

There's 2 ways for CRDTs to keep in sync with each other across replicas, they can transfer state across periodically (convergent replicated data type), or they can transfer updates (deltas) across as they get them (commutative replicated data type). The former takes more bandwidth.

SoundCloud's Roshi is a good example (though no-longer in development it seems), they store data associated with a timestamp, where the timestamp is obviously incrementing. Any updates coming in with a timestamp lesser or equal than the one stored is discarded, which ensures idempotency (repeated writes are OK) and commutativity (out of order writes are ok. Commutativity is a=b means b=a, which in this case means update1 followed by update2 is same as update2 followed by update1)

Writes are sent to all clusters, and if certain nodes fail to respond due to an issue like slowness or partition, they're expected to catch up later via a read-repair, which ensures that the values converge. The convergence can be achieved via 2 protocols as I mentioned above, propagate state or updates to other replicas. I believe Roshi does the former. As part of the read-repair, replicas exchange state, and because data adheres to the semi-lattice property, they converge.

PS. Systems using CRDTs are eventually consistent, i.e they adopt AP (highly available and partition-tolerant) in the CAP theorem.

Another excellent read on the subject.

like image 43
Siddhartha Avatar answered Sep 19 '22 13:09

Siddhartha