
What's the best way to sync large amounts of data around the world?

I have a great deal of data to keep synchronized across 4 or 5 sites around the world, around half a terabyte at each site. Around 1.4 gigabytes of it is added or modified per day, and the data can change at any of the sites.

A large percentage (30%) of the data is duplicate packages (perhaps packaged-up JDKs), so the solution would have to notice when a copy of such a package is already lying around on the local machine and grab that instead of downloading it from another site.
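To make the duplicate-package requirement concrete, here is a minimal sketch of the idea, not an existing tool: index local files by content hash, then, for each file the site should have, copy a local duplicate when one exists and only otherwise go to the network. The manifest format (relative path to SHA-256 hex digest), the function names, and the `fetch_remote` callback are all assumptions made for illustration.

```python
import hashlib
import os
import shutil
from pathlib import Path
from typing import Callable, Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large packages never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_local_index(root: Path) -> Dict[str, Path]:
    """Map content hash -> one local path that currently holds that content."""
    index: Dict[str, Path] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            index.setdefault(sha256_of(path), path)
    return index


def materialize(
    manifest: Dict[str, str],                    # hypothetical: rel path -> sha256
    dest_root: Path,
    index: Dict[str, Path],
    fetch_remote: Callable[[str, Path], None],   # hypothetical downloader
) -> Dict[str, int]:
    """Bring dest_root in line with the manifest, preferring local copies."""
    stats = {"already_present": 0, "copied_locally": 0, "fetched": 0}
    for rel_path, wanted_hash in manifest.items():
        target = dest_root / rel_path
        if target.is_file() and sha256_of(target) == wanted_hash:
            stats["already_present"] += 1
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        candidate = index.get(wanted_hash)
        # Re-hash the candidate before trusting it: the index may be stale.
        if (
            candidate is not None
            and candidate.is_file()
            and sha256_of(candidate) == wanted_hash
        ):
            shutil.copy2(candidate, target)
            stats["copied_locally"] += 1
        else:
            # Assumes the callback writes content matching wanted_hash.
            fetch_remote(rel_path, target)
            stats["fetched"] += 1
        index[wanted_hash] = target
    return stats
```

Re-hashing the candidate costs an extra read of each duplicate, but that is local disk I/O rather than a transfer between continents. A real setup would probably cache hashes keyed by size and modification time instead of rescanning half a terabyte on every run.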

Version control is not an issue; this is not a codebase per se.

I'm just interested in whether there are any solutions out there (preferably open-source) that get close to such a thing.

My baby script using rsync doesn't cut the mustard any more; I'd like to do more complex, intelligent synchronization.

Thanks

Edit: This should be UNIX-based :)

Spedge asked Oct 24 '08 15:10





1 Answer

Have you tried Unison?

I've had good results with it. It's basically a smarter rsync, which may be what you want. There is a listing comparing file syncing tools here.

Vinko Vrsalovic answered Nov 16 '22 00:11
