Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Conflict-free Replicated Data Types (CRDT) vs Paxos or Raft

When is it a good idea to use something like CRDT instead of paxos or raft?

like image 683
Eric des Courtis Avatar asked Jun 28 '12 23:06

Eric des Courtis


3 Answers

If you can use something like CRDT, do so. You should get much better performance. It also enables interesting use cases such as working offline and then merging later. However it is not always possible to design things such that a CRDT will work for you. In that case, paxos can solve the hard problems for you.

But even if you've decided to use paxos, generally you should limit how much work is being done directly through the paxos algorithm. Instead for performance reasons you want to reserve paxos for necessary operations such as master election, and then let a replicated master setup handle most decisions. (In a high throughput environment the master is likely to do something like delegate responsibility for specific shards to specific children, which replicate off each other. Do not let the master become a bottleneck...)

That said, it is much easier to claim that you'll wave the magic wand of paxos than it is to actually do it in practice. In that light you may find http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/chubby-osdi06.pdf to be an interesting description of the difficulties that a real-world paxos implementation is likely to encounter.

like image 181
btilly Avatar answered Sep 29 '22 12:09

btilly


I think this guy know what he is talking about:

Blog

Video

Conclusion about distributed systems

like image 15
Antiarchitect Avatar answered Sep 29 '22 14:09

Antiarchitect


CRDTs and Paxos have different goals and are used under different scenarios. They have in common that they help programmers deal with concurrency/replication. CRDTs are data types that asume concurrent updates will occur. Paxos is a protocol that enforces they wont, by enforcing a total order on them. Let's see this in more detail.

Lets say we have a replicated set which is replicated at two different places.

Using Paxos guarantees that writes to the set will be executed by every replica in the same order. More generally, it guarantees that all replicas AGREE on how the state of the set evolves.

If you have, for example, user1 performing update1 at replica1, adding element 1 to the replicated set while simultaneously user2 performs update2, adding element2 at replica2, Paxos will make replicas agree on a given order for those updates, or possibly agree on choosing one of the two updates and discarding the second one, depending on how you use it and what you want to achieve. If Paxos outcome is, say, that update1 comes before update2, every replica will update the set in that order. As a consequence, users reading the set concurrently with those updates can observe, regardless of where (at which replica) they read, ONLY the following states of the set (assuming the set was empty at the beggining):

{} (empty set)

{element1}

{element1, element2}

Furthermore, these states can be seen ONLY in that order, meaning that once the state of the set is {element1, element2} (at every replica), no subsequent read will return {} or {element1}.

Positive side: This set is simple to reason about, as it is equivalent to a set that is not replicated.

Negative side: Unavailability: If replicas can't talk to each other (network partition), your set can't be updated, as there can be no agreement. Low performance, high-latency: Agreement require that replicas synchronize before replying to the client. This incurs latency proportional to the latency between replicas.

CRDTs have weaker guarantees. A CRDT set is not equivalent to a sequential, single-copy one. It asumes that there is no agreement or total order on how replicas are updated.

CRDTs guarantee that if both replicas of the set have seen the same updates (regardless of the order in which they see them), then they will exhibit the same state; replicas will converge.

In our example of two users performing updates concurrently, a system that does not run Paxos to order operations on the set (this happens, e.g., under eventual or causal consistency), will allow replica1 to add element1 while replica2 is adding element2

so, the state at replica1 will be: {element1}

and the state at replica2 will be: {element2}

At this point in time, replicas diverge. Later, when replicas synchronise, they will exchange their updates, finally exhibiting this state:

state at replica1 will be: {element1, element2}

state at replica2 will be: {element2, element1}

At this point in time, replicas have converged.

Users reading the set concurrently with those updates can observe, depending of where (at which replica) they read, the following states of the set (assuming the set was empty at the beggining):

{} (empty set)

{element1} (if they read from replica1)

{element2} (if they read from replica2)

{element1, element2}

{element2, element1}

Negative side: This set is hard to reason about, as it shows states that could not occur in a sequential set. In our example, we have observed only the case of two concurrent adds to a set, which is straightforward. Concurrent adds and remove are more complex There are many datatypes with different issues:

A comprehensive study of Convergent and Commutative Replicated Data Types

Positive side: High-availability: If replicas can't talk to each other (network partition), your set CAN be updated. Replicas will sync when they connect back. High performance, low-latency: Replicas immediately reply to clients and synchronize in the background, after replying to the client.

like image 10
alek Avatar answered Sep 29 '22 12:09

alek