
Use CloudBlob.ExistsAsync vs catch StorageException.BlobNotFound, in terms of performance?

I want to download a file from Azure Blob Storage that may not exist yet, and I'm looking for the most reliable and performant way to handle this. To this end, I've found two options that both work:

Option 1: Use ExistsAsync()
Duration: This takes roughly 1000~1100ms to complete.

if (await blockBlob.ExistsAsync())
{
    await blockBlob.DownloadToStreamAsync(ms);
    return ms;
}
else
{
    throw new FileNotFoundException();
}

Option 2: Catch the exception
Duration: This takes at least 1600 ms, every time.

try
{
    await blockBlob.DownloadToStreamAsync(ms);                
    return ms;      
}
catch (StorageException e)
{
    // note: there is no 'StorageErrorCodeStrings.BlobNotFound'
    if (e.RequestInformation.ErrorCode == "BlobNotFound")
        throw new FileNotFoundException();

    throw;
}

The measurements come from simple calls to a Web API that consumes the functions above and returns an appropriate message. I tested the end-to-end scenario manually through Postman, so there is some overhead in this approach. In summary, though, ExistsAsync() seems to consistently save at least 500 ms, at least on my local machine while debugging. That is a bit remarkable, because the DoesServiceRequest attribute on ExistsAsync() seems to indicate that it makes an additional, expensive HTTP call.
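Single manual runs through Postman while debugging are noisy, so a small harness can make the comparison more repeatable. What follows is only a minimal sketch, not a proper benchmark: the Timing class and MedianMillisecondsAsync are hypothetical names that are not part of the Azure SDK, and the warm-up and run counts are arbitrary assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Timing
{
    // Hypothetical helper (not part of any SDK): runs `action` a few times
    // untimed as warm-up, then `runs` times timed, and returns the median
    // duration in milliseconds.
    public static async Task<double> MedianMillisecondsAsync(
        Func<Task> action, int warmups = 2, int runs = 10)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        // Warm-up iterations absorb one-off costs such as JIT compilation
        // and connection setup.
        for (int i = 0; i < warmups; i++)
            await action();

        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            await action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        // The median is less sensitive to a single slow outlier than the mean.
        Array.Sort(samples);
        int mid = runs / 2;
        return runs % 2 == 1
            ? samples[mid]
            : (samples[mid - 1] + samples[mid]) / 2.0;
    }
}
```

Each option would be passed in as a delegate that wraps the corresponding snippet and catches FileNotFoundException itself, so that the cost of the not-found path is included in the timing rather than aborting the loop. Running it from a release build without a debugger attached would also address the caveat about debug-only numbers.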

Also, the ExistsAsync API docs don't say anything about its usage or any side effects.

A blunt conclusion, based on poor man's testing, would therefore lead me to option 1, because:

  • it's faster in debug/localhost (the catch: this says nothing about a compiled build running in production)
  • to me it's more elegant, especially because option 2 also requires manually comparing the ErrorCode against a specific string
  • I would assume the ExistsAsync() operation is there for this exact reason

But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?
I.e., is the "WHY" it exists that it is more efficient than simply catching a not-found exception, particularly for performance reasons?

Thanks!

Juliën asked Mar 06 '23 20:03


1 Answer

"But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?"

You can easily take a look at the implementation yourself.

ExistsAsync() is just a wrapper around an HTTP call. If the blob is not there, the service responds with HTTP 404 (Not Found) and the method returns false; otherwise it returns true.
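To illustrate the idea, here is a rough sketch of a headers-only existence probe written against plain HttpClient. It is not the SDK's actual source: BlobProbe and InterpretStatus are hypothetical names, the use of a HEAD request is my understanding of how such a check is typically done, and the sketch assumes the URI can be read without extra authentication (for example, a public blob or a URI carrying a SAS token).

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class BlobProbe
{
    // Maps the status code of the probe response to an "exists" answer:
    // 404 -> false, any 2xx -> true, anything else is treated as a real error.
    public static bool InterpretStatus(HttpStatusCode status)
    {
        if (status == HttpStatusCode.NotFound)
            return false;

        int code = (int)status;
        if (code >= 200 && code < 300)
            return true;

        throw new HttpRequestException(
            $"Unexpected status {code} while probing for the resource.");
    }

    // Sends a HEAD request, so only headers come back and the blob content
    // is never downloaded. Assumption: the URI is readable as-is.
    public static async Task<bool> ExistsAsync(HttpClient client, Uri resourceUri)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, resourceUri))
        using (var response = await client.SendAsync(request))
        {
            return InterpretStatus(response.StatusCode);
        }
    }
}
```

The point of the sketch is that the not-found case is handled as an ordinary return value, with no exception wrapped and rethrown along the way, and that no blob content is transferred in either case.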

I'd say go for ExistsAsync, as it seems the better option, especially if you expect that the blob will sometimes not be there. In the not-found case, DownloadToStreamAsync has more work to do: it wraps the failure in a StorageException and may do some additional cleanup.

"I would assume the ExistsAsync() operation is there for this exact reason"

Consider this: sometimes you just want to know whether a given blob exists without being interested in its content, for example to warn that something will be overwritten when uploading. ExistsAsync is a good fit for that case, because DownloadToStreamAsync would be an expensive way to check for existence: it downloads the content whenever the blob is there.

Peter Bons answered Apr 28 '23 14:04