I have a great deal of data to keep synchronized across four or five sites around the world, around half a terabyte at each site. The data changes (either additions or modifications) by around 1.4 gigabytes per day, and the changes can originate at any of the sites.
A large percentage (30%) of the data consists of duplicate packages (for example, packaged-up JDKs), so the solution would have to be able to detect that such files are already lying around on the local machine and use those copies instead of downloading them from another site.
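Roughly, the duplicate-detection side I have in mind would work like this Python sketch (the paths, the index layout, and the download callback are invented, just to show the idea): index the local tree by content hash, and consult the index before pulling anything from a remote site.

    import hashlib
    from pathlib import Path

    def checksum(path: Path, chunk: int = 1 << 20) -> str:
        """Stream-hash so multi-gigabyte packages never sit fully in RAM."""
        h = hashlib.sha256()
        with path.open("rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
        return h.hexdigest()

    def build_local_index(root: Path) -> dict[str, Path]:
        """Map content hash -> one local copy, so duplicates match by content, not name."""
        index: dict[str, Path] = {}
        for p in root.rglob("*"):
            if p.is_file():
                index.setdefault(checksum(p), p)
        return index

    def obtain(wanted_hash: str, index: dict[str, Path], download):
        """Reuse an existing local duplicate if we have one; only hit the network otherwise."""
        local = index.get(wanted_hash)
        if local is not None:
            return local              # copy/hardlink this file instead of transferring it
        return download(wanted_hash)  # 'download' is a made-up callback for the remote fetch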
Version control is not an issue; this is not a codebase per se.
I'm just wondering whether there are any solutions out there (preferably open source) that come close to such a thing.
My baby script using rsync doesn't cut the mustard any more; I'd like to do more complex, intelligent synchronization.
Thanks
Edit: This should be UNIX-based :)
Have you tried Unison?
I've had good results with it. It's basically a smarter rsync, which may be just what you want. There is a listing comparing file syncing tools here.
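For example, a minimal profile for one pair of sites could look something like this (the profile name, paths, and hostname are made up; root, batch, prefer, and times are standard Unison preferences, but check the manual for the version you install):

    # ~/.unison/mirror.prf -- invented profile name and hosts
    root = /data/mirror
    root = ssh://siteB.example.com//data/mirror

    batch = true      # run without prompting for every change
    prefer = newer    # on a conflict, keep the most recently modified copy
    times = true      # propagate modification times

You'd then run "unison mirror". Bear in mind that Unison synchronizes one pair of roots at a time, so with four or five sites you'd most likely run it in a star topology against a designated hub rather than between every pair.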