Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database Synchronization Algorithm Advice

I'm working on an application that needs an algorithm for data synchronization to be implemented.

We'd be having a main server , and multiple subordinate devices , which would need to be synced together.

Now , I have three algorithms and I'd like advice on which one would be the best according to any of you.I'd really really appreciate your opinions.

1. A description of the algorithm can be found here.Its a scientific research paper by Sang-Wook Kim Division of Information and Communications Hanyang University, Korea

http://goo.gl/yFCHG

2 This algorithm involves maintaing a record of timestamps and version numbers of databases

If for instance , one has version v10 , on one’s mobile device and the server , has v12 , the mobile, assuming that the current timestamp on the mobile device is less recent as compared to the timestamp on the server,

If we denote a deletion by - , an insertion by a + and a change by ~

And the following change logs are associated with a few versions :

v11: +r(44) , ~r(45),-r(46) v12: -r(44),~r(45),+r(47)

Then the overall change in the database is , ~r(45) ( from v12),+r(47),-r(46)

Hence it can be seen that the record r(44) , wasn’t needed ,even though it was added, and then subsequently deleted. Hence no redundant data needs to be transferred.

The whole algorithm can be found here ( I've put it up in a pdf ) http://goo.gl/yPC7A

3 This algorithm in effect - keeps a table that records the last change timestamp for each record.And keeps rows sorted according to timestamp.It synchronizes only those rows that have been changed ,the only con i see here is sorting the table each time according to timestamps .

Here's a link http://goo.gl/8enHO

Thanks a ton for your opinions ! :D

like image 251
Hormigas Avatar asked Mar 17 '13 06:03

Hormigas


1 Answers

I have not been involved in this directly myself, but I have been around when people were working on this sort of thing. Their design was driven not by algorithm analysis or a search for performance, but by hours spent talking to representatives of the end users about what to do when conflicting update requests were received. You might want to work through some use cases with users. It is even possible that users will want different sorts of conflict resolution for different sorts of data in different places.

All the designs here save bandwidth by propagating changes. If something ever causes one side to stop being an exact copy of the other, this inconsistency can persist indefinitely. You could at least detect such a problem by exchanging checksums (SHA-2 or SHA-3 if you are worried enough). One idea is to ask the recipient system for a checksum and then select an package of updates based on that checksum.

like image 165
mcdowella Avatar answered Oct 28 '22 13:10

mcdowella