We are planning to deploy our Azure web application to two separate data centers (one in West Europe and the other in Southeast Asia) purely for performance reasons. We allow users to upload files, which means we need to keep the blob storage of the two data centers in sync. I know Azure provides support for synchronizing structured data, but there seems to be no such support for blob synchronization. My question is:
Is there a service that provides blob synchronization between different data centers? If not, how can I implement one? I see many samples on the web that sync between Azure blob storage and a local file system and vice versa, but none that sync between data centers.
Is there a service that provides blob synchronization between different data centers?
No. Currently there is no out-of-the-box service that synchronizes content between two data centers.
If not, how can I implement one?
Although all the necessary infrastructure is available for you to implement this, the actual implementation would be tricky.
First, you would need to decide whether you want real-time synchronization or whether batched synchronization would do.
For real-time synchronization, you could rely on Async Copy Blob. Using Async Copy Blob, you can instruct the storage service to copy a blob from one storage account to another instead of manually downloading the blob from the source and uploading it to the target. Assuming all uploads happen through your application, as soon as a blob is uploaded you know which data center it was uploaded to. You could then create a SAS URL for this blob and initiate an async copy to the other data center.
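To make the real-time flow concrete, here is a minimal Python sketch of the step that runs right after an upload. It is only an illustration under assumptions: `replicate_after_upload`, `make_read_sas_url` and `get_target` are hypothetical names, and the destination object is only assumed to expose a `start_copy_from_url` method (the Python `azure-storage-blob` v12 `BlobClient` has a method of that name, but any SDK's equivalent copy call would do).

```python
from typing import Callable, Protocol


class CopyTarget(Protocol):
    """Anything that can start a server-side copy from a source URL.

    Assumption: the destination blob client exposes a method with this
    name; swap in whatever your storage SDK provides.
    """

    def start_copy_from_url(self, source_url: str) -> object: ...


def replicate_after_upload(
    blob_name: str,
    make_read_sas_url: Callable[[str], str],
    get_target: Callable[[str], CopyTarget],
) -> object:
    """Ask the other data center's account to pull a freshly uploaded blob.

    make_read_sas_url: hypothetical helper returning a read-only SAS URL
        for the blob in the account where the upload just happened.
    get_target: hypothetical helper returning the destination blob client
        (same blob name, storage account in the other data center).
    """
    # The SAS URL lets the destination account read the source blob
    # without the source container being public.
    source_url = make_read_sas_url(blob_name)
    # The storage service performs the copy itself; the blob's bytes do
    # not pass through this process.
    return get_target(blob_name).start_copy_from_url(source_url)
```

The copy is asynchronous, so the value returned here only tells you the copy was accepted. Checking for completion, and retrying failed copies, would need to be handled separately.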
For batched synchronization, you would need to query both storage accounts and list the blobs in each blob container. If a blob exists in just one storage account and not the other, you could simply create it in the destination storage account by initiating an async copy. Things become trickier if a blob with the same name is present in both storage accounts. In that case you would need to define some rules (such as comparing last-modified dates) to decide whether the blob should be copied from the source to the destination storage account.
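The comparison rules for the batched approach could be sketched as a pure planning function like the one below. This is one possible rule set, not a definitive one: `BlobInfo` and `plan_sync` are hypothetical names, the listings are assumed to have been fetched from each account already, and the content-hash check is my own addition (a copied blob typically gets a fresh last-modified time in the destination, so a purely date-based rule could keep copying the same blob back and forth).

```python
from datetime import datetime
from typing import Dict, List, NamedTuple, Tuple


class BlobInfo(NamedTuple):
    last_modified: datetime  # as reported by the container listing
    content_hash: str        # e.g. the blob's Content-MD5, if it is set


Listing = Dict[str, BlobInfo]  # blob name -> info, for one container


def plan_sync(a: Listing, b: Listing) -> Tuple[List[str], List[str]]:
    """Return (names to copy from a to b, names to copy from b to a)."""
    a_to_b: List[str] = []
    b_to_a: List[str] = []
    for name in sorted(set(a) | set(b)):
        if name not in b:
            # Present only in a: create it in b.
            a_to_b.append(name)
        elif name not in a:
            # Present only in b: create it in a.
            b_to_a.append(name)
        elif a[name].content_hash == b[name].content_hash:
            # Same content on both sides: nothing to do.
            continue
        elif a[name].last_modified > b[name].last_modified:
            # Content differs and a is newer: a wins.
            a_to_b.append(name)
        else:
            # Content differs and b is newer (ties also go to b,
            # an arbitrary choice).
            b_to_a.append(name)
    return a_to_b, b_to_a
```

Each name in the two returned lists would then be handed to an async copy in the corresponding direction. Note that this sketch does not handle deletions: a blob deleted in one account would simply be recreated from the other on the next run.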
For scheduling the batch synchronization, you could make use of the Windows Azure Scheduler Service. Even with this service, you would still need to write the synchronization logic yourself. The Scheduler service only takes care of the scheduling part; it won't do the actual synchronization.
I would recommend using a worker role to implement the synchronization logic. Another alternative is Web Jobs, which were announced recently, though I don't know much about them.