
How to sync large lists between client and server

I'd like to sync a large list of items between the client and the server. The list is too big to sync in a single request, so how can I make sure it is synced correctly with a reasonable number of calls to the synchronization service?

For example, suppose I want to sync a list with 100,000 items, so I expose a web service with the following signature:

getItems(int offset,int quantity): Item[]

The problem comes when the list is modified between calls. For example:

 getItems(0,100)  : returns items (in the original list) [0,100)
 getItems(100,100): returns items (in the original list) [100,200)
 ##### before the next call, items [0,100) are removed ####
 getItems(200,100): returns items (in the original list) [300,400)

So the items [200,300) are never retrieved. (Duplicate items can also be retrieved if items are added instead of removed.)
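The failure described above can be reproduced with a small in-memory simulation. This is only a sketch: the Python list `server` stands in for the real data store, each item is identified by its integer value, and `get_items` is a hypothetical stand-in for the web service call.

```python
def get_items(items, offset, quantity):
    """Offset-based page: a slice of whatever the list looks like right now."""
    return items[offset:offset + quantity]


server = list(range(1000))   # item i is identified by the integer i
client = []

client += get_items(server, 0, 100)     # original items [0, 100)
client += get_items(server, 100, 100)   # original items [100, 200)
del server[0:100]                       # items [0, 100) are removed between calls
client += get_items(server, 200, 100)   # now returns original items [300, 400)

# Original items [200, 300) were skipped because every index shifted by 100.
missed = [i for i in range(200, 300) if i not in client]
print(len(missed))  # 100
```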

How can I ensure a correct sync of this list?

Addev asked Mar 25 '14 13:03


2 Answers

  1. From time to time, the service should save an immutable snapshot of the list. The interface then becomes getItems(long snapshotNumber, int offset, int quantity).

  2. To save time, space, and traffic, not every modification of the list should produce a snapshot. Instead, every modification should produce a log message (e.g. add items, remove a range of items), and those log messages should be sent to the client instead of full snapshots. The interface can be getModification(long snapshotNumber, int modificationNumber): Modification.
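A minimal in-memory sketch of the two points above, under some assumptions: the class name `SyncService`, the tuple encoding of log messages, and the helper `apply_modification` are all made up for illustration, and a real service would persist snapshots and logs rather than keep them in memory. The client pages through a frozen snapshot, then replays the modifications logged since that snapshot.

```python
class SyncService:
    """Immutable snapshots plus a modification log per snapshot (sketch)."""

    def __init__(self, items):
        self._live = list(items)
        self._snapshots = []   # snapshot number -> frozen tuple of items
        self._logs = []        # snapshot number -> modifications made after it
        self.take_snapshot()

    def take_snapshot(self):
        self._snapshots.append(tuple(self._live))
        self._logs.append([])
        return len(self._snapshots) - 1

    def add(self, index, new_items):
        self._live[index:index] = new_items
        self._logs[-1].append(("add", index, tuple(new_items)))

    def remove_range(self, start, stop):
        del self._live[start:stop]
        self._logs[-1].append(("remove", start, stop))

    def get_items(self, snapshot_number, offset, quantity):
        return list(self._snapshots[snapshot_number][offset:offset + quantity])

    def get_modification(self, snapshot_number, modification_number):
        log = self._logs[snapshot_number]
        return log[modification_number] if modification_number < len(log) else None


def apply_modification(items, mod):
    """Replay one log message on the client's copy."""
    if mod[0] == "add":
        _, index, new_items = mod
        items[index:index] = new_items
    else:
        _, start, stop = mod
        del items[start:stop]


service = SyncService(range(1000))
snapshot, client, offset = 0, [], 0
while page := service.get_items(snapshot, offset, 100):
    client += page
    offset += len(page)
    if offset == 200:
        service.remove_range(0, 100)   # concurrent change while paging

# Paging saw a consistent snapshot; now catch up by replaying the log.
n = 0
while (mod := service.get_modification(snapshot, n)) is not None:
    apply_modification(client, mod)
    n += 1
```

In this sketch the log for snapshot k only holds the changes made before the next snapshot is taken, so a client on an older snapshot would replay the logs of each later snapshot in turn, or simply restart from the newest snapshot.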

Alexei Kaigorodov answered Oct 14 '22 16:10


Can you order the list by some parameter on the server side? A real-world use case for this scenario is showing records in a table in a UI: the number of records on the server can be huge, so you wouldn't want to fetch the whole list at once; instead you fetch a batch each time the user scrolls.

In this case, if the list is ordered, you get a lot of things for free, and your API becomes getItems(long lastRecordId, int quantity). Here lastRecordId is a unique key identifying a particular record. The server uses this key to find the position of that record, retrieves the next batch starting just after it, and returns the batch; the client then passes the recordId of the last record it received in its next API call.

You don't have to maintain snapshots, and no duplicate records are retrieved. The removal/insertion scenarios you mention don't occur in this case. However, if you want to track additions and removals affecting data the client has already seen, at some point you have to discard the client's copy and start syncing all over again.
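The key-based API can be sketched as follows, assuming the server keeps record ids in a sorted list (`server` below, a stand-in for an indexed table) and that `None` means "start from the beginning". The function name `get_items` and the use of plain integers as records are illustrative choices, not part of the original API.

```python
from bisect import bisect_right


def get_items(sorted_ids, last_record_id, quantity):
    """Return up to `quantity` ids strictly greater than last_record_id."""
    if last_record_id is None:
        start = 0
    else:
        # The position is derived from the key, so shifted indexes don't matter.
        start = bisect_right(sorted_ids, last_record_id)
    return sorted_ids[start:start + quantity]


server = list(range(1000))
client, last_id, calls = [], None, 0
while page := get_items(server, last_id, 100):
    client += page
    last_id = page[-1]        # the client remembers only the last key it saw
    calls += 1
    if calls == 2:
        del server[0:100]     # items [0, 100) are removed between calls
```

Here the client ends up with no gaps and no duplicates, but it still holds the removed items [0, 100), which is the caveat about data the client has already seen. Against a SQL database the same idea is usually written as a query along the lines of `WHERE id > :lastRecordId ORDER BY id LIMIT :quantity`.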

user1168577 answered Oct 14 '22 16:10