We currently have a blob storage with thousands of files under the same Azure container. Our file naming convention is something like this:
StorageName\Team\SubTeam\FileName
I'm writing a tool that displays the files for each particular subteam. The code gets the list of blobs for the Container and then for each of those it tries to match to the correct Team\Subteam (see below for sample code).
This works but is extremely slow, because I need to go through every file in the container to see whether it matches a particular subteam. Is there some way to improve the speed of the query? I can think of optimizations such as "find the first file that matches the team you are looking for, then exit the loop early as soon as you hit a different team", but that would assume that the blob list is sorted, and it wouldn't fix the worst-case scenario.
Unfortunately splitting the files under different containers is not an option at this time.
Here is sample code:
// Lists every blob in the container (flat listing), then filters client-side.
IEnumerable<CloudBlob> blobs = blobContainer.ListBlobs(
    new BlobRequestOptions()
    {
        UseFlatBlobListing = true,
        BlobListingDetails = BlobListingDetails.Metadata
    }).OfType<CloudBlob>();

foreach (CloudBlob blob in blobs)
{
    // Segments[2] is the first path segment after the container name, i.e. the team.
    string blobTeamId = blob.Uri.Segments[2].Trim('/');
    if (blobTeamId != teamId)
        continue;
    // Do something interesting with the file
}
1st Solution: With the REST interface you can pass in
http://somwhere.com/mycontainername/?restype=container&comp=list&delimiter=/&prefix=\Team\SubTeam
and this will return an XML document containing only the files in the subteam "folder" (I know it's not a real folder, but it looks like one in the tools). Note that blob names are separated with forward slashes, so the prefix would normally be written as Team/SubTeam/.
If the container is not public, you might need to generate a shared access signature and append it to the end of the URL.
Check out here, where it shows that you can filter by blob name prefix.
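As a rough illustration of the URL described above, here is a minimal sketch that assembles the list request for one subteam prefix. The class name BlobListUrl, the Build helper, and the sasToken parameter are hypothetical names made up for this example, and the prefix is written with forward slashes on the assumption that the blob names look like Team/SubTeam/FileName.

```csharp
using System;

public static class BlobListUrl
{
    // Builds a "list blobs" URL restricted to a single prefix.
    // containerUrl and sasToken are hypothetical inputs, not from the original post.
    public static string Build(string containerUrl, string prefix, string sasToken = null)
    {
        string url = containerUrl.TrimEnd('/')
            + "?restype=container&comp=list"
            + "&delimiter=" + Uri.EscapeDataString("/")
            + "&prefix=" + Uri.EscapeDataString(prefix);

        // If a shared access signature is supplied, append it as extra query parameters.
        if (!string.IsNullOrEmpty(sasToken))
            url += "&" + sasToken.TrimStart('?');

        return url;
    }

    public static void Main()
    {
        Console.WriteLine(Build("http://somwhere.com/mycontainername", "Team/SubTeam/"));
    }
}
```

The prefix and delimiter values are percent-encoded so that the slashes survive as query parameter values. The response still has to be fetched and its XML parsed, which this sketch leaves out.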
2nd Solution: This is probably closer to what you want. If you can use the new storage client that was updated in the Azure SDK 1.3, then you can now use
IEnumerable blobList = client.ListBlobsWithPrefix("Team/SubTeam");
where client is an instance of CloudBlobClient.
EDIT - 18 Nov 2013: It looks like resttype is no longer supported as a parameter; it should be restype. This seems to have changed quietly over the weekend. I have updated the URL example above.
Just an update...
You can get a list of blobs by calling GetDirectoryReference and then listing the blobs in that directory:
var subDirectory = blobContainer.GetDirectoryReference(String.Format("{0}/", folder));
return subDirectory.ListBlobs(false, BlobListingDetails.Metadata);