 

How to sync between two Azure storage (blobs) hosted on two different data centers

We are planning to deploy our Azure web application to two separate data centers (one in West Europe and the other in Southeast Asia) purely for performance reasons. We allow users to upload files, which means we need to keep the blob storage of the two data centers in sync. I know Azure provides support for synchronizing structured data, but there seems to be no such support for blob synchronization. My question is:

Is there a service that provides blob synchronization between different data centers? If not, how can I implement one? I see many samples on the web for syncing between Azure blob storage and a local file system (in either direction), but none for syncing between data centers.

Suresh Kumar asked Feb 06 '14 08:02



1 Answer

Is there a service that provides blob synchronization between different data centers?

No. Currently there is no out-of-the-box service that synchronizes blob content between two data centers.

if not, how can I implement one?

Although all the necessary infrastructure is available for you to implement this, the actual implementation would be tricky.

First, you would need to decide whether you want real-time synchronization or whether batched synchronization would do.

For real-time synchronization you could rely on Async Copy Blob. With async copy blob you instruct the storage service itself to copy a blob from one storage account to another, instead of manually downloading the blob from the source and uploading it to the target. Assuming all uploads happen through your application, you know which data center a blob was uploaded to as soon as the upload completes. You could then create a SAS URL for that blob and initiate an async copy to the storage account in the other data center.
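As a rough illustration of the real-time flow, here is a minimal Python sketch. Everything in it is an assumption for illustration: the helper names, the account and container names, and the `CopyTarget` interface are hypothetical, and the SAS token is assumed to have been generated elsewhere with read permission. The interface is modeled on the `start_copy_from_url` method of `BlobClient` in the `azure-storage-blob` Python SDK, which starts a server-side copy and returns a dictionary of copy properties, so a real client for the destination blob could probably be passed in directly.

```python
from typing import Protocol
from urllib.parse import quote


class CopyTarget(Protocol):
    """Hypothetical interface: anything that can start a server-side copy from a URL."""

    def start_copy_from_url(self, source_url: str) -> dict: ...


def build_source_url(account: str, container: str, blob_name: str, sas_token: str) -> str:
    """Build a readable URL for the freshly uploaded blob.

    sas_token is assumed to be a read-only SAS query string created elsewhere.
    """
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name)}?{sas_token}"
    )


def replicate_after_upload(source_url: str, target: CopyTarget) -> str:
    """Ask the destination account to pull the blob; return the reported copy status."""
    result = target.start_copy_from_url(source_url)
    # The copy is asynchronous, so the status is often "pending" at this point.
    return result.get("copy_status", "unknown")
```

Because the copy runs inside the storage service, the call returns before the data has moved; the application would have to check the copy status later if it needs to know when the blob is available in the other data center.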

For batched synchronization, you would need to query both storage accounts and list the blobs in each blob container. If a blob exists in only one storage account, you could simply create it in the other account by initiating an async copy blob. Things become trickier if a blob with the same name is present in both storage accounts. In that case you would need to define some rules (for example, comparing the last-modified dates) to decide whether the blob should be copied from the source to the destination storage account.
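The comparison step of a batched sync could look something like the Python sketch below. It works on plain listings (blob name mapped to properties) rather than on a real storage client, and the names `BlobInfo` and `plan_sync` are hypothetical. The rule it applies is only one possible choice: copy when the blob is missing in the destination, or when the source is newer and the content hashes differ. The hash check is my own assumption, added because a copied blob gets a fresh last-modified time in the destination, so a date-only rule could keep copying the same blob back and forth.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BlobInfo:
    """Hypothetical summary of one listed blob."""
    last_modified: datetime
    content_md5: Optional[str] = None  # may be absent on some blobs


def plan_sync(source: dict[str, BlobInfo], dest: dict[str, BlobInfo]) -> list[str]:
    """Return the names of blobs that should be copied from source to dest."""
    to_copy = []
    for name, src in source.items():
        dst = dest.get(name)
        if dst is None:
            to_copy.append(name)  # present in only one account
            continue
        same_content = src.content_md5 is not None and src.content_md5 == dst.content_md5
        if not same_content and src.last_modified > dst.last_modified:
            to_copy.append(name)  # same name in both accounts, source is newer
    return sorted(to_copy)
```

For a two-way sync the function would be called once in each direction, for example `plan_sync(west, asia)` and then `plan_sync(asia, west)`, and an async copy would be started for every name returned.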

For scheduling the batch synchronization, you could make use of the Windows Azure Scheduler Service. Even with this service, you would still need to write the synchronization logic yourself. The Scheduler only takes care of running the job on a schedule; it won't do the actual synchronization.

I would recommend using a worker role to implement the synchronization logic. Another alternative is Web Jobs, which were announced recently, though I don't know much about them.

Gaurav Mantri answered Dec 26 '22 09:12
