Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Differences between OT and CRDT

Can someone explain me simply the main differences between Operational Transform and CRDT?

As far as I understand, both are algorithms that permits data to converge without conflict on different nodes of a distributed system.

In which usecase would you use which algorithm? As far as I understand, OT is mostly used for text and CRDT is more general and can handle more advanced structures right?

Is CRDT more powerful than OT?


I ask this question because I am trying to see how to implement a collaborative editor for HTML documents, and not sure in which direction to look first. I saw the ShareJS project, and their attempts to support rich text collaboration on the browser on contenteditables elements. Nowhere in ShareJS I see any attempt to use CRDT for that.

We also know that Google Docs is using OT and it's working pretty well for real-time edition of rich documents. Is Google's choice of using OT because CRDT was not very known at that time? Or would it be a good choice today too?

I'm also interested to hear about other use cases too, like using these algorithms on databases. Riak seems to use CRDT. Can OT be used to sync nodes of a database too and be an alternative to Paxos/Zab/Raft?

like image 945
Sebastien Lorber Avatar asked Nov 01 '14 23:11

Sebastien Lorber


People also ask

How do you use crdt?

A G-counter is a perfect example of an operational CRDT that merges the operations. Here, a + b = b + a and a + (b + c) = (a + b) + c. The replicas exchange only the updates (additions) with each other. The CRDT will merge the updates by adding them up.

What is the meaning of CRDTs?

CRDT stands for Conflict-free Replicated Data Type. In sum, CRDTs provide a way for you to merge concurrent modifications, always, in any order. Let's talk more about what CRDTs are, how they work, plus what they mean for serverless multi-region and failover.

Does Google Docs use operational transformation?

In Google Docs, the first problem is handled by operational transformation and the second problem is handled by the collaboration protocol, which is the subject of this post. To open a Google document, you need code running in two places: your browser and our servers.


1 Answers

Both approaches are similar in that they provide eventual consistency. The difference is in how they do it. One way of looking at it is:

  • OT does it by changing operations. Operations are sent over the wire and concurrent operations are transformed once they are received.
  • CRDTs do it by changing state. Operations are made on the local CRDT. Its state is sent over the wire and is merged with the state of a copy. It doesn't matter how many times or in what order merges are made - all copies converge.

You're right, OT is mostly used for text and does predate CRDTs but research shows that:

many OT algorithms in the literature do not satisfy convergence properties unlike what was stated by their authors

In other words CRDT merging is commutative while OT transformation functions sometimes are not.

From the Wikipedia article on CRDT:

OTs are generally complex and non-scalable

There are different kinds of CRDTs (sets, counters, ...) suited for different kinds of problems. There are some that are designed for text editing. For example, Treedoc - A commutative replicated data type for cooperative editing.

like image 58
Andrejs Avatar answered Sep 22 '22 09:09

Andrejs