I'm working on an application that needs an algorithm for data synchronization to be implemented.
We'd be having a main server , and multiple subordinate devices , which would need to be synced together.
Now , I have three algorithms and I'd like advice on which one would be the best according to any of you.I'd really really appreciate your opinions.
1. A description of the algorithm can be found here.Its a scientific research paper by Sang-Wook Kim Division of Information and Communications Hanyang University, Korea
http://goo.gl/yFCHG
2 This algorithm involves maintaing a record of timestamps and version numbers of databases
If for instance , one has version v10 , on one’s mobile device and the server , has v12 , the mobile, assuming that the current timestamp on the mobile device is less recent as compared to the timestamp on the server,
If we denote a deletion by - , an insertion by a + and a change by ~
And the following change logs are associated with a few versions :
v11: +r(44) , ~r(45),-r(46) v12: -r(44),~r(45),+r(47)
Then the overall change in the database is , ~r(45) ( from v12),+r(47),-r(46)
Hence it can be seen that the record r(44) , wasn’t needed ,even though it was added, and then subsequently deleted. Hence no redundant data needs to be transferred.
The whole algorithm can be found here ( I've put it up in a pdf ) http://goo.gl/yPC7A
3 This algorithm in effect - keeps a table that records the last change timestamp for each record.And keeps rows sorted according to timestamp.It synchronizes only those rows that have been changed ,the only con i see here is sorting the table each time according to timestamps .
Here's a link http://goo.gl/8enHO
Thanks a ton for your opinions ! :D
I have not been involved in this directly myself, but I have been around when people were working on this sort of thing. Their design was driven not by algorithm analysis or a search for performance, but by hours spent talking to representatives of the end users about what to do when conflicting update requests were received. You might want to work through some use cases with users. It is even possible that users will want different sorts of conflict resolution for different sorts of data in different places.
All the designs here save bandwidth by propagating changes. If something ever causes one side to stop being an exact copy of the other, this inconsistency can persist indefinitely. You could at least detect such a problem by exchanging checksums (SHA-2 or SHA-3 if you are worried enough). One idea is to ask the recipient system for a checksum and then select an package of updates based on that checksum.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With