Delta encoding for JSON objects [closed]

Is there a standard library or tool for computing and applying differences to JSON documents? I have a number of largish documents that I want to keep synchronized across a network, and I would prefer not to resend their entire state on every synchronization, since many of the fields won't change. In other words, I want to transmit only the fields that changed, not the entire object. It would be convenient to have something like the following set of methods:

    // Start with two distinct objects on the server:
    //   prev represents a copy of the state of the object on the client
    //   next represents a copy of the state of the object on the server

    // 1. Compute a patch
    patch = computePatch(prev, next);

    // 2. Send patch over the network

    // 3. Apply the patch on the client
    applyPatch(prev, patch);

    // Final invariant:
    //   prev represents an equivalent object to JSON.parse(JSON.stringify(next))
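To make the intended contract concrete, here is a minimal sketch of what `computePatch` and `applyPatch` could look like. It is an illustration under stated assumptions, not an existing library: the patch format (`replace`, `set`, `remove`) is invented for this example, arrays are treated as atomic values and replaced wholesale, and keys such as `__proto__` are not handled. Unlike the signature above, `applyPatch` returns the patched value, because a root-level replacement cannot be done in place.

```javascript
// Hypothetical patch format (an assumption, not a standard):
//   null                          -> nothing changed
//   { replace: value }            -> replace the whole value
//   { set: {...}, remove: [...] } -> edit an object key by key

function isPlainObject(v) {
  return v !== null && typeof v === "object" && !Array.isArray(v);
}

function computePatch(prev, next) {
  if (!isPlainObject(prev) || !isPlainObject(next)) {
    // Primitives and arrays are compared by their JSON text and
    // replaced as a whole when they differ.
    return JSON.stringify(prev) === JSON.stringify(next) ? null : { replace: next };
  }
  const set = {};
  const remove = [];
  for (const key of Object.keys(next)) {
    if (!Object.prototype.hasOwnProperty.call(prev, key)) {
      set[key] = { replace: next[key] };      // new field
    } else {
      const sub = computePatch(prev[key], next[key]);
      if (sub !== null) set[key] = sub;       // changed field
    }
  }
  for (const key of Object.keys(prev)) {
    if (!Object.prototype.hasOwnProperty.call(next, key)) remove.push(key);
  }
  return Object.keys(set).length === 0 && remove.length === 0 ? null : { set, remove };
}

function applyPatch(prev, patch) {
  if (patch === null) return prev;
  if ("replace" in patch) {
    // Deep copy so the result shares no structure with the patch.
    return JSON.parse(JSON.stringify(patch.replace));
  }
  for (const key of patch.remove) delete prev[key];
  for (const key of Object.keys(patch.set)) {
    prev[key] = applyPatch(prev[key], patch.set[key]);
  }
  return prev;
}

// Usage: only the changed fields end up in the patch.
let clientState = { title: "doc", meta: { rev: 1, tags: ["a"] } };
const serverState = { title: "doc", meta: { rev: 2, tags: ["a"] } };
const wirePatch = JSON.stringify(computePatch(clientState, serverState));
clientState = applyPatch(clientState, JSON.parse(wirePatch));
```

Because the patch itself is plain JSON, it can be sent with `JSON.stringify` and parsed on the other side, as the usage lines show.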

I could certainly implement one myself, but there are quite a few edge cases that need to be considered. Here are some straightforward (though somewhat unsatisfactory) approaches that I can think of:

  1. Roll my own JSON patcher. Asymptotically, this is probably the best way to go, since it would be possible to support all the relevant features of JSON documents, along with specialized methods for diffing ints, doubles, and strings (using relative encoding/edit distance). However, JSON has a lot of special cases, and I am leery of trying this without a lot of testing. I would much prefer to find something that already solves this problem, so that I can trust it and not have to worry about network Heisenbugs caused by mistakes in my JSON patching.

  2. Just compute the edit distance directly between the JSON strings using dynamic programming. Unfortunately, this doesn't work if the client and server have different JSON implementations (i.e., their fields could be serialized in a different order), and it is also pretty expensive, being a quadratic-time operation.

  3. Use protocol buffers. Protocol buffers have a built-in diff method which does exactly what I want, and they are a nice binary-serializable, network-friendly format. Unfortunately, because they are also strictly typed, they lack many of the advantages of JSON, such as the ability to dynamically add and remove fields. This is the approach I am currently leaning towards, but it could make future maintenance really horrible, as I would need to continually update each of my objects.

  4. Do something really nasty, like make a custom protocol for each type of object, and hope that I get it right in both places (yeah right!).
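The field-order problem mentioned in option 2 is easy to reproduce. The sketch below shows two equivalent objects that serialize to different strings, and a hypothetical `canonicalStringify` helper (a name made up for this example) that sorts keys so equivalent documents produce identical text. It addresses only the ordering issue, not the quadratic cost of the string diff.

```javascript
// Two objects with the same fields, created in a different order.
const fromClient = { id: 7, name: "doc" };
const fromServer = { name: "doc", id: 7 };

// JSON.stringify follows property insertion order for string keys,
// so the serialized text differs even though the documents are equivalent.
console.log(JSON.stringify(fromClient) === JSON.stringify(fromServer)); // false

// Hypothetical helper: serialize with object keys sorted recursively.
function canonicalStringify(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const fields = keys.map(
      (k) => JSON.stringify(k) + ":" + canonicalStringify(value[k])
    );
    return "{" + fields.join(",") + "}";
  }
  // Numbers, strings, booleans, and null serialize as usual.
  return JSON.stringify(value);
}

console.log(canonicalStringify(fromClient) === canonicalStringify(fromServer)); // true
```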

Of course, what I am really hoping for is that someone here on Stack Overflow will come through and save the day with a reference to a space-efficient JavaScript object differ/patcher that has been well tested in production environments and across multiple browsers.

*Update*

I started writing my own patcher. An early version is available on GitHub here:

https://github.com/mikolalysenko/patcher.js

Since there doesn't seem to be much out there, I will instead accept, as an alternative answer, a list of interesting test cases for a JSON patcher.

Mikola asked Sep 06 '11 21:09


1 Answer

I've been maintaining a JSON diff & patch library on GitHub (yes, shameless plug):

https://github.com/benjamine/JsonDiffPatch

It handles long strings automatically using Neil Fraser's diff_match_patch library. It works both in browsers and on the server (unit tests run in both environments). The full feature list is on the project page.

The only thing you would probably need that isn't implemented is the option to inject custom diff/patch functions for specific objects, but that doesn't sound hard to add. You're welcome to fork it and, even better, send a pull request.

Regards,

Benja answered Oct 12 '22 06:10