
Downloading the contents of an Azure blob as a text string takes too long

I am developing an application that:

  1. Uploads a .CSV file to Azure blob storage from my local machine using a simple HTTP web page (REST methods)

  2. Once the .CSV file is uploaded, fetches the stream in order to update my database

The .CSV file is around 30 MB. It takes 2 minutes to upload to the blob, but 30 minutes to read the stream. How can I improve the read speed? Here is the code snippet used to read the stream from the file (reference: https://azure.microsoft.com/en-in/documentation/articles/storage-dotnet-how-to-use-blobs/):

public string GetReadData(string filename)
{
    // Retrieve storage account from connection string.
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(System.Web.Configuration.WebConfigurationManager.AppSettings["StorageConnectionString"]);

    // Create the blob client.
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

    // Retrieve reference to a previously created container.
    CloudBlobContainer container = blobClient.GetContainerReference(System.Web.Configuration.WebConfigurationManager.AppSettings["BlobStorageContainerName"]);

    // Retrieve reference to a blob named "filename"
    CloudBlockBlob blockBlob2 = container.GetBlockBlobReference(filename);

    // Download the whole blob into memory, then decode it as UTF-8 text.
    string text;
    using (var memoryStream = new MemoryStream())
    {
        blockBlob2.DownloadToStream(memoryStream);
        text = System.Text.Encoding.UTF8.GetString(memoryStream.ToArray());
    }

    return text;
}
Rohit asked Aug 13 '15 08:08





1 Answer

To speed up the process, one thing you could do is download the blob in chunks instead of reading the entire file in one go. Take a look at the DownloadRangeToStream method.

Essentially, the idea is that you first create an empty file of 30 MB (the size of your blob). Then, in parallel, you download chunks of 1 MB (or whatever size you see fit) using the DownloadRangeToStream method. As each chunk finishes downloading, you write its contents at the corresponding offset in the file.
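A minimal sketch of that idea follows. It is not the code from the linked answer, and it makes some assumptions: since the question wants a string rather than a file, it fills a preallocated in-memory buffer instead of an empty file, and the actual range download is passed in as a callback so that the chunking logic stays independent of the storage SDK. The names ChunkedBlobDownload, DownloadInParallel, downloadRange and maxParallelism are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class ChunkedBlobDownload
{
    // Downloads totalLength bytes in chunks of at most chunkSize bytes, several at a time,
    // and copies each chunk to its own offset in one preallocated buffer.
    // downloadRange is a hypothetical callback: (offset, length) -> the bytes of that range.
    public static byte[] DownloadInParallel(
        long totalLength,
        long chunkSize,
        Func<long, long, byte[]> downloadRange,
        int maxParallelism = 4)
    {
        if (totalLength < 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (downloadRange == null) throw new ArgumentNullException(nameof(downloadRange));

        var result = new byte[totalLength];
        long chunkCount = (totalLength + chunkSize - 1) / chunkSize; // round up
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        Parallel.For(0L, chunkCount, options, i =>
        {
            long offset = i * chunkSize;
            long length = Math.Min(chunkSize, totalLength - offset); // last chunk may be shorter
            byte[] chunk = downloadRange(offset, length);

            // Chunks cover disjoint regions of the buffer, so no locking is needed.
            Array.Copy(chunk, 0L, result, offset, length);
        });

        return result;
    }
}

// Possible wiring with the blob from the question (an assumption, not tested here):
//
//   blockBlob2.FetchAttributes();                  // populates Properties.Length
//   long size = blockBlob2.Properties.Length;
//   byte[] data = ChunkedBlobDownload.DownloadInParallel(size, 1024 * 1024, (offset, length) =>
//   {
//       var buffer = new byte[length];
//       blockBlob2.DownloadRangeToByteArray(buffer, 0, offset, length);
//       return buffer;
//   });
//   string text = System.Text.Encoding.UTF8.GetString(data);
```

The text is decoded only after all chunks are in place, because a multi-byte UTF-8 character can straddle a chunk boundary. The chunk size and degree of parallelism are guesses to tune against your own network.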

I answered a similar question on SO a few days ago: "StorageException when downloading a large file over a slow network". Take a look at my answer there. The chunks there are downloaded sequentially, but it should give you some idea of how to implement a chunked download.

Gaurav Mantri answered Oct 07 '22 01:10
