
How to batch retrieve entities?

In Azure table storage, how can I query for a set of entities that match specific row keys in a partition?

I'm using Azure table storage and need to retrieve a set of entities that match a set of row keys within the partition.

Basically, if this were SQL, it might look something like this:

SELECT TOP 1 SomeKey
FROM TableName WHERE SomeKey IN (1, 2, 3, 4, 5);

To save on costs and avoid issuing a bunch of individual table retrieve operations, I figured I could do this with a single table batch operation. However, I'm getting an exception that says:

"A batch transaction with a retrieve operation cannot contain any other operations"

Here is my code:

public async Task<IList<MyDomainEntity>> GetDomainEntitiesAsync(int someId, IList<Guid> entityIds)
{
    try
    {
        var client = _storageAccount.CreateCloudTableClient();
        var table = client.GetTableReference("SomeTable");
        var batchOperation = new TableBatchOperation();
        var counter = 0;
        var myDomainEntities = new List<MyDomainEntity>();

        foreach (var id in entityIds)
        {
            if (counter < 100)
            {
                batchOperation.Add(TableOperation.Retrieve<MyDomainEntityTableEntity>(someId.ToString(CultureInfo.InvariantCulture), id.ToString()));
                ++counter;
            }
            else
            {
                var batchResults = await table.ExecuteBatchAsync(batchOperation);
                var batchResultEntities = batchResults.Select(o => ((MyDomainEntityTableEntity)o.Result).ToMyDomainEntity()).ToList();
                myDomainEntities.AddRange(batchResultEntities);
                batchOperation.Clear();
                counter = 0;
            }
        }

        return myDomainEntities;
    }
    catch (Exception ex)
    {
        _logger.Error(ex);
        throw;
    }
}

How can I achieve what I'm after without manually looping through the set of row keys and doing an individual Retrieve table operation for each one? I don't want to incur the cost associated with doing this since I could have hundreds of row keys that I want to filter on.

spoof3r asked Jan 02 '16 05:01


2 Answers

I made a helper method to do it in a single request per partition.

Use it like this:

var items = table.RetrieveMany<MyDomainEntity>(partitionKey, nameof(TableEntity.RowKey), 
     rowKeysList, columnsToSelect);

Here are the helper methods:

    public static List<T> RetrieveMany<T>(this CloudTable table, string partitionKey, 
        string propertyName, IEnumerable<string> valuesRange, 
        List<string> columnsToSelect = null)
        where T : TableEntity, new()
    {
        // Build one query: PartitionKey == partitionKey AND (property == v1 OR property == v2 ...)
        var entities = table.ExecuteQuery(new TableQuery<T>()
            .Where(TableQuery.CombineFilters(
                TableQuery.GenerateFilterCondition(
                    nameof(TableEntity.PartitionKey),
                    QueryComparisons.Equal,
                    partitionKey),
                TableOperators.And,
                GenerateIsInRangeFilter(
                    propertyName,
                    valuesRange)
            ))
            .Select(columnsToSelect))
            .ToList();
        return entities;
    }


    public static string GenerateIsInRangeFilter(string propertyName, 
         IEnumerable<string> valuesRange)
    {
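        // Standard null check (requires "using System;") in place of the custom NotNull extension, which is not shown here.
        string finalFilter = (valuesRange ?? throw new ArgumentNullException(nameof(valuesRange)))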
            .Distinct()
            .Aggregate((string)null, (filterSeed, value) =>
            {
                string equalsFilter = TableQuery.GenerateFilterCondition(
                    propertyName,
                    QueryComparisons.Equal,
                    value);
                return filterSeed == null ?
                    equalsFilter :
                    TableQuery.CombineFilters(filterSeed,
                                              TableOperators.Or,
                                              equalsFilter);
            });
        return finalFilter ?? "";
    }

I have tested it with fewer than 100 values in rowKeysList. If it throws an exception when there are more, we can always split the request into parts.
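Splitting the request into parts could look roughly like the sketch below. It is only an illustration: the `RowKeyChunking` class, the `SplitIntoChunks` name and the chunk size are assumptions, not part of the storage SDK, and the right chunk size depends on how long the resulting filter string is allowed to get.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowKeyChunking
{
    // Hypothetical helper: splits the distinct row keys into consecutive
    // groups of at most chunkSize items, preserving the original order.
    public static List<List<string>> SplitIntoChunks(IEnumerable<string> rowKeys, int chunkSize)
    {
        if (rowKeys == null) throw new ArgumentNullException(nameof(rowKeys));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        return rowKeys
            .Distinct()
            // Pair each key with its position so it can be assigned to a group.
            .Select((key, index) => new { key, index })
            .GroupBy(x => x.index / chunkSize)
            .Select(group => group.Select(x => x.key).ToList())
            .ToList();
    }
}
```

Each chunk can then be passed to RetrieveMany in turn and the results concatenated, for example with `SplitIntoChunks(rowKeysList, 50).SelectMany(chunk => table.RetrieveMany<MyDomainEntity>(partitionKey, nameof(TableEntity.RowKey), chunk, columnsToSelect))`, where 50 is an arbitrary value chosen for illustration.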

Artemious answered Oct 19 '22 10:10


With hundreds of row keys, that rules out using $filter with a list of row keys (which would result in a partial partition scan anyway).

With the error you're getting, it seems like the batch contains both queries and other types of operations (which isn't permitted). I don't see why you're getting that error from your code snippet.

Your only other option is to execute individual queries. You can do these asynchronously though, so you wouldn't have to wait for each to return. Table storage provides upwards of 2,000 transactions / sec on a given partition, so it's a viable solution.
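As a minimal sketch of running the individual lookups concurrently: the `ParallelRetrieve` class, the `retrieveOneAsync` delegate and the `maxConcurrency` default below are hypothetical names and values introduced for illustration. The delegate is assumed to wrap a single point lookup (for instance a call such as `table.ExecuteAsync(TableOperation.Retrieve<T>(partitionKey, rowKey))`) and to return null when the entity is not found.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelRetrieve
{
    // Hypothetical helper: starts one lookup per distinct row key, never running
    // more than maxConcurrency at a time, and returns the entities that were found.
    public static async Task<List<T>> RetrieveManyAsync<T>(
        IEnumerable<string> rowKeys,
        Func<string, Task<T>> retrieveOneAsync,
        int maxConcurrency = 20) where T : class
    {
        if (rowKeys == null) throw new ArgumentNullException(nameof(rowKeys));
        if (retrieveOneAsync == null) throw new ArgumentNullException(nameof(retrieveOneAsync));

        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = rowKeys.Distinct().Select(async rowKey =>
            {
                await throttle.WaitAsync();
                try
                {
                    return await retrieveOneAsync(rowKey);
                }
                finally
                {
                    throttle.Release();
                }
            }).ToList();

            // Task.WhenAll returns results in the same order as the tasks.
            var results = await Task.WhenAll(tasks);
            return results.Where(entity => entity != null).ToList();
        }
    }
}
```

The semaphore is a design choice rather than a requirement: it keeps hundreds of row keys from becoming hundreds of simultaneous requests, and the cap can be tuned against the partition's throughput.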

David Makogon answered Oct 19 '22 12:10