I am developing an application that:
Uploads a .CSV file to Azure Blob storage from my local machine using a simple HTTP web page (REST methods).
Once the .CSV file is uploaded, fetches the stream in order to update my database.
The .CSV file is around 30 MB. It takes 2 minutes to upload to the blob, but 30 minutes to read the stream. Can you please suggest how to improve the speed? Here is the code snippet being used to read the stream from the file: https://azure.microsoft.com/en-in/documentation/articles/storage-dotnet-how-to-use-blobs/
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public string GetReadData(string filename)
{
    // Retrieve storage account from connection string.
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(System.Web.Configuration.WebConfigurationManager.AppSettings["StorageConnectionString"]);
    // Create the blob client.
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    // Retrieve reference to a previously created container.
    CloudBlobContainer container = blobClient.GetContainerReference(System.Web.Configuration.WebConfigurationManager.AppSettings["BlobStorageContainerName"]);
    // Retrieve reference to the blob named by "filename".
    CloudBlockBlob blockBlob2 = container.GetBlockBlobReference(filename);
    string text;
    // Download the whole blob into memory in a single call, then decode it as UTF-8.
    using (var memoryStream = new MemoryStream())
    {
        blockBlob2.DownloadToStream(memoryStream);
        text = System.Text.Encoding.UTF8.GetString(memoryStream.ToArray());
    }
    return text;
}
To speed up the process, one thing you could do is read the file in chunks instead of reading the entire file in one go. Take a look at the DownloadRangeToStream method.
Essentially, the idea is that you first create an empty file of 30 MB (the size of your blob). Then, in parallel, you download 1 MB chunks (or whatever size you see fit) using the DownloadRangeToStream method. As each chunk is downloaded, you write its contents to the corresponding position in the file.
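The idea can be sketched roughly as follows. This is only a minimal sketch, not a tuned implementation: it fills a pre-allocated in-memory buffer rather than a file, and the names ChunkedDownload, RangeReader, DownloadInParallel, chunkSize and maxParallel are made up for illustration. The range read is passed in as a delegate so the sketch does not depend on the storage SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class ChunkedDownload
{
    // Hypothetical delegate: writes `length` bytes of the source, starting at `offset`, into `target`.
    public delegate void RangeReader(Stream target, long offset, long length);

    public static byte[] DownloadInParallel(RangeReader readRange, long totalLength,
                                            int chunkSize = 1024 * 1024, int maxParallel = 4)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        // Pre-allocate the full buffer (the in-memory counterpart of the "empty file").
        var result = new byte[totalLength];

        // Compute the starting offset of every chunk.
        var offsets = new List<long>();
        for (long offset = 0; offset < totalLength; offset += chunkSize)
        {
            offsets.Add(offset);
        }

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(offsets, options, offset =>
        {
            // The last chunk may be shorter than chunkSize.
            long length = Math.Min(chunkSize, totalLength - offset);
            using (var chunk = new MemoryStream())
            {
                readRange(chunk, offset, length);
                if (chunk.Length != length)
                {
                    throw new IOException("Unexpected chunk size at offset " + offset);
                }
                // Each chunk is copied into its own disjoint region, so no locking is needed.
                Array.Copy(chunk.GetBuffer(), 0, result, offset, length);
            }
        });

        return result;
    }
}
```

With the code from the question, the delegate could probably be (target, offset, length) => blockBlob2.DownloadRangeToStream(target, offset, length), with totalLength taken from blockBlob2.Properties.Length after calling blockBlob2.FetchAttributes(). The returned bytes can then be decoded with System.Text.Encoding.UTF8.GetString as before. Whether this actually helps depends on where the 30 minutes are being spent, so it is worth measuring the download separately from the database update.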
I answered a similar question on SO a few days ago: StorageException when downloading a large file over a slow network. Take a look at my answer there. In that answer the chunks are downloaded sequentially, but it should give you some idea of how to implement a chunked download.