
Why does Google Docs operational transformation err on the side of deletion?

Tried out this experiment today: opened two offline editors for a Google document. In one, I bolded the first word. In the second, I deleted it. Regardless of which client I turn on first, the word always ends up deleted.

First off, why is this the case? My understanding of operational transformation is that ordering matters. In the simple example of two people typing "a" and "b" respectively, if the server receives "a" first, it will enforce the output "ab" by transforming the second person's "b" event into a "pass one space, then add b" event, and vice versa.

Secondly, if ordering doesn't matter, are there technical reasons as to why Google Docs has chosen to err on the side of deletion? Or are the reasons largely simplicity for users?
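The "a"/"b" example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Insert` class, `apply`, `transform`, and the `applied_wins_ties` flag are hypothetical names invented here, not anything taken from Google Docs. The flag models the rule that the edit the server received first keeps the earlier position when two inserts hit the same index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index at which the text is inserted
    text: str  # text to insert


def apply(doc: str, op: Insert) -> str:
    """Apply a single insert to a plain string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, applied: Insert, applied_wins_ties: bool) -> Insert:
    """Rewrite `op` so it can run after `applied` has already been applied."""
    if applied.pos < op.pos or (applied.pos == op.pos and applied_wins_ties):
        # "pass over what the other person typed, then insert"
        return Insert(op.pos + len(applied.text), op.text)
    return op


a = Insert(0, "a")  # first person's edit, received by the server first
b = Insert(0, "b")  # second person's concurrent edit

# Server: applies "a", then the transformed "b".
server = apply(apply("", a), transform(b, a, applied_wins_ties=True))

# Second person's client: already applied "b" locally, then receives "a".
client_b = apply(apply("", b), transform(a, b, applied_wins_ties=False))

print(server, client_b)
```

Both sides end up with the same text, with "a" ahead of "b" because the server saw "a" first. Ordering matters here only because two inserts compete for the same position; a delete and a formatting change do not compete in that way.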

ehfeng asked Oct 14 '12 10:10



3 Answers

Here is (5 years later, I know) a graphical explanation of why this happens. This is, in fact, what @osma describes, but explained graphically:

When you bold a string in GDocs you are wrapping the string in a container, presumably <strong></strong>, though they may use some other syntax. For simplicity, let's say that bolding a word just requires a "+" at the beginning of the word, so the text "lorem ipsum" would become "lorem +ipsum" rather than "lorem <strong>ipsum</strong>".

1

Both Alice and Bob start with the text "Lorem ipsum".

2

Bob then deletes "ipsum". Notice that he sends the changeset retain(6), delete(5) to the server. A changeset is essentially a patch, Google probably used this library.

3

Now Alice bolds "ipsum" (adding "+"). She sends the changeset retain(6), insert(+), retain(5).

4

Both changesets are traveling to the server. The server knows nothing about these sets yet.

5

Assume the worst case: Bob's package arrives first, so the server deletes the word. The other case, where Alice's package arrives first, is obvious.

6

When Alice's package arrives, it only adds a "+" to the text: the word her changeset would have retained is already gone, so the insertion is all that remains of her edit.

7

Both changesets are then broadcast to the clients. This is the first one.

8

And this is the second one.

9

After patching these changesets into the original text you end up with "Lorem +". The server and all clients now have the same text. The + symbol would later be erased by a common HTML cleanup pass that eliminates empty tags like <tag></tag>.

To test this demo go to: http://operational-transformation.github.io/visualization.html. There you can play with the texts and packages as they are sent/received.
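The walkthrough above can also be reproduced in code. The following Python sketch is a simplified stand-in for the retain/insert/delete changesets described in the steps, written under assumptions: the tuple representation and the `apply` and `transform` helpers are hypothetical names chosen here, and nothing in it is taken from Google's actual implementation.

```python
def apply(doc, ops):
    """Apply a changeset (a list of (kind, arg) tuples) to a string."""
    out, i = [], 0
    for kind, arg in ops:
        if kind == "retain":
            out.append(doc[i:i + arg])
            i += arg
        elif kind == "insert":
            out.append(arg)
        else:  # "delete": skip over `arg` characters
            i += arg
    assert i == len(doc), "changeset must span the whole document"
    return "".join(out)


def transform(a, b):
    """Given concurrent changesets a and b on the same text, return (a2, b2)
    such that applying a then b2 gives the same text as applying b then a2."""
    a, b = list(a), list(b)
    a2, b2 = [], []
    while a or b:
        if a and a[0][0] == "insert":
            text = a.pop(0)[1]
            a2.append(("insert", text))
            b2.append(("retain", len(text)))  # b2 steps over a's new text
            continue
        if b and b[0][0] == "insert":
            text = b.pop(0)[1]
            b2.append(("insert", text))
            a2.append(("retain", len(text)))
            continue
        # Both heads now consume original text; handle the shared prefix.
        (ka, na), (kb, nb) = a[0], b[0]
        n = min(na, nb)
        a[0], b[0] = (ka, na - n), (kb, nb - n)
        if a[0][1] == 0:
            a.pop(0)
        if b[0][1] == 0:
            b.pop(0)
        if ka == "retain" and kb == "retain":
            a2.append(("retain", n))
            b2.append(("retain", n))
        elif ka == "delete" and kb == "retain":
            a2.append(("delete", n))
        elif ka == "retain" and kb == "delete":
            b2.append(("delete", n))
        # both delete the same text: nothing left for either side to do
    return a2, b2


doc = "Lorem ipsum"
bob = [("retain", 6), ("delete", 5)]
alice = [("retain", 6), ("insert", "+"), ("retain", 5)]

alice2, bob2 = transform(alice, bob)

server_text = apply(apply(doc, bob), alice2)  # Bob's package arrived first
alice_text = apply(apply(doc, alice), bob2)   # Alice's own client

print(alice2)
print(repr(server_text), repr(alice_text))
```

In this sketch Alice's transformed changeset has lost its trailing retain(5), because the text it covered no longer exists, and both the server and Alice's client converge on "Lorem +". Applying the two changesets in the opposite order on the server gives the same text, which is why the arrival order does not change the outcome.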

adelriosantiago answered Oct 17 '22 14:10


It's not a question of erring on the side of deletion.

In cases where both clients have equally valid but differing versions of the truth, Google Docs must elect to uphold one version, or else force users to resolve conflicts, something that is inherently complicated and hard to explain.

Thus, "truth" for Google Docs is consistency of the document, not discernment of intent. And consistency is most easily achieved through destruction of information - a sort of tendency to entropy.

All this is basically my semi-philosophical BS though...

ehfeng answered Oct 17 '22 13:10


OT does not try to discern intent; it applies transformations in an order which produces a consistent result. When you apply both of those changes to a document, it does not matter which order you apply them in.

"first second" -> "first second" (with "second" in bold) -> "first"

"first second" -> "first" -> "first"

In the second stream, the bold operation is performed on a zero-length string.

This is exactly the kind of result you would get if, in one of those documents, you had italicized the second word instead of deleting it: the end result would be "first second", with the second word both bold and italic, regardless of transformation order. The delete transformation is no different.
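The two streams can be checked with a small Python sketch. It assumes a toy model in which a document is a list of (character, set of styles) pairs and formatting is a range operation; `style`, `delete`, and `range_after_delete` are hypothetical helpers made up for this illustration, not Google Docs internals.

```python
def make_doc(text):
    """A toy document: one (character, styles) pair per character."""
    return [(ch, frozenset()) for ch in text]


def text_of(doc):
    return "".join(ch for ch, _ in doc)


def style(doc, start, end, name):
    """Add a style to every character in [start, end)."""
    return [(ch, st | {name}) if start <= i < end else (ch, st)
            for i, (ch, st) in enumerate(doc)]


def delete(doc, start, end):
    """Remove the characters in [start, end)."""
    return doc[:start] + doc[end:]


def range_after_delete(start, end, d_start, d_end):
    """Where does [start, end) land once [d_start, d_end) has been deleted?"""
    def move(p):
        if p <= d_start:
            return p
        if p >= d_end:
            return p - (d_end - d_start)
        return d_start  # the position was inside the deleted text
    return move(start), move(end)


doc = make_doc("first second")
bold = (6, 12)     # bold "second"
removed = (5, 12)  # delete " second"

# Stream 1: bold first, then delete (styling does not move any positions).
stream1 = delete(style(doc, *bold, "bold"), *removed)

# Stream 2: delete first, then the bold range transformed past the delete.
bold2 = range_after_delete(*bold, *removed)
stream2 = style(delete(doc, *removed), *bold2, "bold")

# Bold in one client and italic in the other commute as well.
bi = style(style(doc, *bold, "bold"), *bold, "italic")
ib = style(style(doc, *bold, "italic"), *bold, "bold")

print(bold2, text_of(stream1), text_of(stream2), bi == ib)
```

In the second stream the transformed bold range collapses to zero length, so applying it changes nothing, and both streams end with the same document.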

osma answered Oct 17 '22 12:10