
Strategy to minimize Azure storage outbound data costs

I am building a web site that (among other things) allows the user to upload photos via Web API. The user images will be stored in Azure blob storage to be displayed in user albums and shared on social media. The site will be hosted as an Azure web site. I am eager to minimize data transfer costs. I understand that data transfer between an Azure web site and table/blob storage incurs no data transfer charge (as it is not considered "outbound"), while data requested from outside the Azure web site does. In response to this, I have 2 strategies for exposing the images to the browser:

1.) Via the URI to the image blob in azure storage e.g. with local storage account http://ipv4.fiddler:10000/devstoreaccount1/bcb2ad7581.jpg

2.) Via web api that downloads the image bytes from storage and returns them. e.g. with local host http://localhost:58559/api/image/bcb2ad7581.jpg

These are my assumptions. Direct access to storage (method 1 above) is more efficient. Accessing the images via Web API (method 2 above) must incur overheads that direct access doesn't, right? Each Web API request must consume an ASP.NET thread plus CPU cycles. Every Web API image request being processed leaves one fewer slot for requests to the site's other Web API resources, which must then be queued. On the other hand, any external site the image is shared with would add a data transfer cost (among other costs) for each image request if accessed via method 1.

So my strategy is to access the images within the site via a direct link to the storage (method 1), e.g. when the user opens an album, all <img> tags have an Azure blob URI in their src attribute. However, when the user clicks on the Facebook icon to share, I will provide a link to the image via Web API (method 2). I realise the user can bypass all of that with plugins like the "PinIt" button etc., but that's OK.

I am only learning this stuff, so I could be way off. Am I wrong about outbound transfer costs not being applied to azure web sites? I don't think I am but the whole pricing model is confusing, to say the least.

Is accessing blob storage from a browser HTML page with an <img> tag and src attribute considered outbound data transfer, even if the HTML page comes from an Azure web site domain? I mean, is it only free when the server-side code accesses the storage, not the HTML client?

Is any data transfer cost saved via method 2 (if indeed there is one) simply cancelled out by a different cost associated with the Web API method (like bandwidth cost)?

Am I wrong about the performance benefit of direct access to the blob storage, or possibly wrong about the overhead of the web api requests?

It is early days in the design, so I can dump Azure if I have to. I would rather not though, as I think it is what I'm looking for. I don't want something for nothing and am happy to pay for the services I consume. Naturally, though, I don't want my ignorance to cost me.

I could do with your advice, on this, and truly appreciate your help.

Seamus Barrett asked Jan 22 '15 11:01



1 Answer

To answer your questions:

Am I wrong about outbound transfer costs not being applied to azure web sites?

Sadly, Yes :) Any data that goes out of an Azure Datacenter (DC) incurs an outbound transfer cost and that includes data served through your websites.

Is accessing blob storage from a browser HTML page with an <img> tag and src attribute considered outbound data transfer, even if the HTML page comes from an Azure web site domain? I mean, is it only free when the server-side code accesses the storage, not the HTML client?

Yes. Remember the browser is consuming the data which is sitting outside of Azure DC.

Is any data transfer cost saved via method 2 (if indeed there is one), simply cancelled out by a different cost associated with the web api method (like bandwidth cost)?

No. Because data eventually flows out of Azure DC (doesn't matter if it is via storage directly or via web api).
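The point can be checked with a little arithmetic. The sketch below models each method as a list of hops and counts only the bytes on hops that leave the datacenter. It assumes the web site and the storage account sit in the same region, so the storage-to-Web-API hop is not billed; the image size and request count are made-up illustrative numbers, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    source: str
    target: str
    leaves_datacenter: bool  # only these hops count as outbound transfer


# Method 1: the browser fetches the blob straight from storage.
DIRECT = [Hop("blob storage", "browser", True)]

# Method 2: Web API reads the blob (assumed same region, so not billed),
# then sends the same bytes on to the browser.
VIA_WEB_API = [
    Hop("blob storage", "web api", False),
    Hop("web api", "browser", True),
]


def billable_bytes(route, image_bytes, requests):
    """Bytes billed as outbound transfer for a given route."""
    outbound_hops = sum(1 for hop in route if hop.leaves_datacenter)
    return outbound_hops * image_bytes * requests


IMAGE_BYTES = 200 * 1024  # illustrative: a 200 KiB photo
REQUESTS = 1_000_000      # illustrative: views per month

direct = billable_bytes(DIRECT, IMAGE_BYTES, REQUESTS)
via_web_api = billable_bytes(VIA_WEB_API, IMAGE_BYTES, REQUESTS)
print(direct == via_web_api)
```

Both routes end with exactly one hop out of the datacenter carrying the full image, so the billable volume is identical. Method 2 only adds work on the web server.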

Am I wrong about the performance benefit of direct access to the blob storage, or possibly wrong about the overhead of the web api requests?

You will certainly get better performance by providing direct access to blob storage than by transferring data through Web API. Routing the data through Web API also increases latency.

Solution Recommendation

For your application, may I recommend that you look at Shared Access Signature functionality offered by Azure Blob Storage. I believe this will significantly improve the performance of your application.

For uploads, you could create a SAS URL with upload permission and have your web application upload files directly into blob storage. That way the upload data won't be routed through your servers. I wrote some blog posts on the same which you may find useful:

http://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-access-signature-html-and-javascript/

http://gauravmantri.com/2013/12/01/windows-azure-storage-and-cors-lets-have-some-fun/

For downloading images, again have your Web API return a SAS URL instead of reading the image data from blob storage and then streaming that data back to the client browser.
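To illustrate the idea behind a SAS URL (a link that carries its own permission, expiry time and signature), here is a minimal conceptual sketch. It is not Azure's actual SAS format or signing scheme; the key, query parameter names and URL are all hypothetical, and in a real application the token would come from the Azure Storage SDK rather than hand-rolled code like this.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical secret known only to the server (stands in for the account key).
SECRET_KEY = b"hypothetical-signing-key"


def _signature(base_url, permission, expires_at, key):
    payload = f"{base_url}\n{permission}\n{expires_at}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_url(base_url, permission, expires_at, key=SECRET_KEY):
    """Return base_url plus permission, expiry and signature query parameters."""
    sig = _signature(base_url, permission, expires_at, key)
    query = urlencode({"perm": permission, "exp": expires_at, "sig": sig})
    return f"{base_url}?{query}"


def is_valid(url, now, key=SECRET_KEY):
    """Check that the URL is unexpired and its parameters were not altered."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        permission = params["perm"][0]
        expires_at = int(params["exp"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = _signature(base_url, permission, expires_at, key)
    return now <= expires_at and hmac.compare_digest(sig, expected)


# The Web API would hand this URL to the browser instead of the image bytes.
read_url = sign_url(
    "https://example.invalid/photos/bcb2ad7581.jpg",  # hypothetical blob URL
    permission="r",
    expires_at=int(time.time()) + 600,  # valid for ten minutes
)
print(is_valid(read_url, now=int(time.time())))
```

The Web API call stays cheap because it returns only a short string, and the browser then pulls the image bytes directly from storage.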

Gaurav Mantri answered Sep 18 '22 06:09
