
Multithreading - Task.WhenAll when downloading blob storage files

I have the following code that downloads blob files into strings. It works, but performance is very poor: it takes about 50 seconds to process 500 files.

           try
            {
                var sourceClient = new BlobServiceClient(storageConnectionString);
                var foundItems = sourceClient.FindBlobsByTags("Client = 'TEST'").ToList();

                foreach (var blob in foundItems)
                {
                    var blobClient = blobContainer.GetBlockBlobClient(blob.BlobName);
                    BlobDownloadResult download = await blobClient.DownloadContentAsync();
                    string downloadedData = download.Content.ToString();
                    myList.Add(downloadedData);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Exception: {ex.Message}");
            }
        

I tried running the downloads on multiple threads, but it still takes about 25 seconds to process 500 files.

           var semaphore = new SemaphoreSlim(50);
           var tasks = new List<Task>();
            try
            {
                var sourceClient = new BlobServiceClient(storageConnectionString);
                var foundItems = sourceClient.FindBlobsByTags("Client = 'TEST'").ToList();

                foreach (var blob in foundItems)
                {
                    tasks.Add(Task.Run(async () =>
                    {
                        try
                        {
                            await semaphore.WaitAsync();
                            var blobClient = blobContainer.GetBlockBlobClient(blob.BlobName);
                            BlobDownloadResult download = await blobClient.DownloadContentAsync();
                            string downloadedData = download.Content.ToString();
                            myList.Add(downloadedData);
                        }
                        finally
                        {
                            semaphore.Release();
                        }
                    }));
                }
                await Task.WhenAll(tasks);
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Exception: {ex.Message}");
            }

I'm pretty new to C#. Am I doing anything wrong with the multithreading? What's the fastest way to read files from blob storage?

Note: the following line of code causes the most delay.

BlobDownloadResult download = await blobClient.DownloadContentAsync();
Blue asked Dec 05 '25 01:12



1 Answer

The two biggest performance problems with your code are:

  • Don't wrap the download in Task.Run. The download is already asynchronous I/O, so Task.Run just occupies thread pool threads for no reason.
  • Stop switching contexts for no reason: use .ConfigureAwait(false) on your await calls.

A third problem, minor in comparison:

  • You're adding to a List<> from multiple threads. List<> is not thread-safe, so concurrent Add calls can corrupt it or lose items. Use a proper concurrent container, like ConcurrentBag<>. Edit: in fact, I'm not even convinced you need the list. Use the return value of Task.WhenAll to gather the results instead of gathering them by hand.
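
The three points above could be combined roughly as follows. This is only a minimal sketch, not a tuned implementation: ThrottledDownloader, DownloadAllAsync, the downloadAsync delegate and maxConcurrency are hypothetical names introduced for illustration, and the delegate stands in for the blob download so the snippet compiles without the Azure SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloader
{
    // Hypothetical helper: runs downloadAsync once per name, with at most
    // maxConcurrency downloads in flight, and returns the results in the
    // same order as blobNames.
    public static async Task<string[]> DownloadAllAsync(
        IReadOnlyList<string> blobNames,
        Func<string, Task<string>> downloadAsync,
        int maxConcurrency)
    {
        using var semaphore = new SemaphoreSlim(maxConcurrency);

        async Task<string> DownloadOneAsync(string name)
        {
            // Acquire before the try block so that Release only runs
            // after a successful wait.
            await semaphore.WaitAsync().ConfigureAwait(false);
            try
            {
                return await downloadAsync(name).ConfigureAwait(false);
            }
            finally
            {
                semaphore.Release();
            }
        }

        // No Task.Run: calling the async method starts the work and hands
        // back a Task as soon as it reaches its first incomplete await.
        Task<string>[] tasks = blobNames.Select(DownloadOneAsync).ToArray();

        // Task.WhenAll returns the results as an array, so no shared
        // List<> or ConcurrentBag<> is needed.
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```

With the code from the question, the delegate would presumably be something like `async name => (await blobContainer.GetBlockBlobClient(name).DownloadContentAsync().ConfigureAwait(false)).Value.Content.ToString()`, with `foundItems.Select(b => b.BlobName).ToList()` as the list of names. The right value for maxConcurrency depends on your network and storage account limits, so measure rather than assume 50 is best.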
Blindy answered Dec 06 '25 15:12



