
Azure Blob storage: DownloadToByteArray VS DownloadToStream

Tags: c#, azure

I have been playing with the Azure Blob Storage service to save and retrieve files in the context of a web page to be hosted in Azure Web Pages.

During the learning process I have come up with two solutions. The first uses DownloadToStream with a FileStream. In this case I have to write the file to the server's disk before returning it to the user.

public static Stream GetFileContent(string fileName, HttpContextBase context)
{
    CloudBlobContainer container = GetBlobContainer();
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);
    // Resolve the local path once; the blob is written to disk before it is served.
    string localPath = context.Server.MapPath("~/App_Data/files/" + fileName);
    using (Stream fileStream = new FileStream(localPath, FileMode.Create))
    {
        blockBlob.DownloadToStream(fileStream);
    }
    // Reopen the downloaded file for reading and hand the stream to the caller.
    return File.OpenRead(localPath);
}

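public ActionResult Download(string fileName)
{
    // GetFileContent returns a Stream here, so pass the controller's HttpContext
    // and use the File(Stream, ...) overload.
    Stream fileContent = MyFileContext.GetFileContent(fileName, HttpContext);
    return File(fileContent, "application/zip", fileName);
}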

The second uses the DownloadToByteArray function, which writes the content of the blob into a byte array allocated with the size of the blob.

public static byte[] GetFileContent(string fileName)
{
    CloudBlobContainer container = GetBlobContainer();           
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);
    blockBlob.FetchAttributes();
    long fileByteLength = blockBlob.Properties.Length;
    byte[] fileContent = new byte[fileByteLength];
    for (int i = 0; i < fileByteLength; i++)
    {
        fileContent[i] = 0x20;
    }
    blockBlob.DownloadToByteArray(fileContent,0);
    return fileContent;
}

public ActionResult Download(string fileName)
{
    // Call the byte[]-returning helper defined above.
    byte[] fileContent = MyFileContext.GetFileContent(fileName);
    return File(fileContent, "application/zip", fileName);
}

Comparing the two options, the first needs to create a file on the server's disk, whereas the second holds the whole blob in a byte array, consuming memory. In my particular case I am going to handle file sizes of ~150 MB.

Given the circumstances (environment, file sizes...) which approach do you think is best?

Julen asked Jun 19 '14 17:06



3 Answers

Instead of streaming the blob through your server, you could let the client download it directly from blob storage. My answer builds on Steve's response here: Downloading Azure Blob files in MVC3. To download a blob directly from storage, you would use a Shared Access Signature (SAS). Azure Storage recently introduced an enhancement that allows you to specify the Content-Disposition header in the SAS. See this modified code.

    public static string GetDownloadLink(string fileName)
    {
        CloudBlobContainer container = GetBlobContainer();
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);
        //Create an ad-hoc Shared Access Policy with read permissions which will expire in 12 hours
        SharedAccessBlobPolicy policy = new SharedAccessBlobPolicy()
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddHours(12),
        };
        //Set content-disposition header for force download
        SharedAccessBlobHeaders headers = new SharedAccessBlobHeaders()
        {
            ContentDisposition = string.Format("attachment;filename=\"{0}\"", fileName),
        };
        var sasToken = blockBlob.GetSharedAccessSignature(policy, headers);
        return blockBlob.Uri.AbsoluteUri + sasToken;
    }

    public ActionResult Download(string fileName)
    {
        var sasUrl = GetDownloadLink(fileName);
        //Redirect to SAS URL ... file will now be downloaded directly from blob storage.
        return Redirect(sasUrl);
    }
Gaurav Mantri answered Oct 05 '22 19:10


The benefit of Stream is that you can process the data piece by piece as it is downloaded, rather than building up a big byte[] and then operating on the full thing. Your use of Stream isn't really getting that benefit, since you are writing to a file and then reading that full file into memory. A good use of the stream API would be to pipe the download stream directly to the request's response stream, as shown in the answer here: Downloading Azure Blob files in MVC3
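As a rough illustration of piping the download straight into the response, here is a minimal sketch rather than the linked answer's exact code. It assumes the question's GetBlobContainer() helper is reachable as MyFileContext.GetBlobContainer(); the controller name is hypothetical, and the "application/zip" content type is taken from the question.

```csharp
using System.Web.Mvc;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical controller name, used only for this sketch.
public class FilesController : Controller
{
    public ActionResult Download(string fileName)
    {
        // Assumption: the question's helper is exposed on MyFileContext.
        CloudBlobContainer container = MyFileContext.GetBlobContainer();
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);

        // Turn off response buffering so data is sent to the client as it
        // arrives instead of accumulating in server memory.
        Response.BufferOutput = false;
        Response.ContentType = "application/zip";
        Response.AddHeader("Content-Disposition",
            "attachment; filename=\"" + fileName + "\"");

        // Write the blob directly into the HTTP response stream:
        // no temporary file and no full-size byte array.
        blockBlob.DownloadToStream(Response.OutputStream);

        // The body has already been written, so there is nothing left to render.
        return new EmptyResult();
    }
}
```

With this shape the server only holds a small transfer buffer at a time, which matters for files around 150 MB, but the bytes still flow through the web server.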

Robert Levy answered Oct 05 '22 21:10


If you are planning to use DownloadToByteArray (async or not), you will have to fetch the blob attributes first to get the size for the byte array.

If you use DownloadToStream, you will not have to do that. That's one saved HTTP call to blob storage, and if I am not mistaken, FetchAttributes() is executed as an HTTP HEAD request that counts as a normal transaction (in other words, it will cost you some money).
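To illustrate the point about skipping FetchAttributes(), here is a minimal sketch that still returns a byte[] but lets a MemoryStream grow as data arrives. The class name is hypothetical, and the container is passed in as a parameter instead of calling the question's GetBlobContainer() helper.

```csharp
using System.IO;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical helper class, used only for this sketch.
public static class BlobDownloadSketch
{
    public static byte[] GetFileContent(CloudBlobContainer container, string fileName)
    {
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);

        using (var buffer = new MemoryStream())
        {
            // No FetchAttributes() call: the stream grows as bytes arrive,
            // so the blob length does not need to be known up front.
            blockBlob.DownloadToStream(buffer);

            // ToArray() copies the buffered bytes into a new array.
            return buffer.ToArray();
        }
    }
}
```

The trade-off is memory: the MemoryStream reallocates its internal buffer as it grows and ToArray() makes another copy, so for blobs around 150 MB the peak memory use can exceed that of a single preallocated array.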

Zygimantas answered Oct 05 '22 21:10