Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use integer RowKeys in Azure Table Storage?

I have consecutively numbered entities that I want to persist with the Azure Table Service, however the type of the RowKey column is problematic. The number of the entity should be stored in the RowKey column, so I can query entities fast (PK = '..' && RowKey = 5), get newest entities (RowKey > 10) and query a certain set of entities (RowKey > 5 && RowKey < 10).

Since RowKey must be a string, lower-than comparisons are problematic ("100" < "11"). I thought about prepending zeros to the numbers (so that "100" > "011"), but I can't predict the number of entities (and thus the number of zeros).

I know I could just create an integer column, but I would loose the performance advantage of the indexed RowKey column (plus I don't have any other information suitable for RowKey). Did anyone have this problem before?

like image 788
enzi Avatar asked Mar 07 '11 10:03

enzi


People also ask

How do I add data to Azure Table storage?

Use an account key The following properties are supported. The type property must be set to AzureTableStorage. Specify the information needed to connect to Storage for the connectionString property. You can also put account key in Azure Key Vault and pull the accountKey configuration out of the connection string.

What is PartitionKey and RowKey in Azure Table?

The row key is a unique identifier for an entity within a given partition. Together the PartitionKey and RowKey uniquely identify every entity within a table. The row key is a string value that may be up to 1 KiB in size. You must include the RowKey property in every insert, update, and delete operation.


2 Answers

I had a similar problem, with the added caveat that I also wanted to support having the RowKey sorted in descending order. In my case I did not care about supporting trillions of possible values because I was correctly using the PartitionKey and also using scoping prefixes when needed to further segment the RowKey (like "scope-id" -> "12-8374").

In the end I settled on an specific implementation of the general approach suggested by enzi. I used a modified version of Base64 encoding, producing a four character string, which supports over 16 million values and can be sorted in ascending or descending order. Here is the code, which has been unit tested but lacks range checking/validation.

/// <summary>
/// Gets the four character string representation of the specified integer id.
/// </summary>
/// <param name="number">The number to convert</param>
/// <param name="ascending">Indicates whether the encoded number will be sorted ascending or descending</param>
/// <returns>The encoded string representation of the number</returns>
public static string NumberToId(int number, bool ascending = true)
{
    if (!ascending)
        number = 16777215 - number;

    return new string(new[] { 
        SixBitToChar((byte)((number & 16515072) >> 18)), 
        SixBitToChar((byte)((number & 258048) >> 12)), 
        SixBitToChar((byte)((number & 4032) >> 6)), 
        SixBitToChar((byte)(number & 63)) });
}

/// <summary>
/// Gets the numeric identifier represented by the encoded string.
/// </summary>
/// <param name="id">The encoded string to convert</param>
/// <param name="ascending">Indicates whether the encoded number is sorted ascending or descending</param>
/// <returns>The decoded integer id</returns>
public static int IdToNumber(string id, bool ascending = true)
{
    var number = ((int)CharToSixBit(id[0]) << 18) | ((int)CharToSixBit(id[1]) << 12) | ((int)CharToSixBit(id[2]) << 6) | (int)CharToSixBit(id[3]);

    return ascending ? number : -1 * (number - 16777215);
}

/// <summary>
/// Converts the specified byte (representing 6 bits) to the correct character representation.
/// </summary>
/// <param name="b">The bits to convert</param>
/// <returns>The encoded character value</returns>
[MethodImplAttribute(MethodImplOptions.AggressiveInlining)] 
static char SixBitToChar(byte b)
{
    if (b == 0)
        return '!';
    if (b == 1)
        return '$';
    if (b < 12)
        return (char)((int)b - 2 + (int)'0');
    if (b < 38)
        return (char)((int)b - 12 + (int)'A');
    return (char)((int)b - 38 + (int)'a');
}

/// <summary>
/// Coverts the specified encoded character into the corresponding bit representation.
/// </summary>
/// <param name="c">The encoded character to convert</param>
/// <returns>The bit representation of the character</returns>
[MethodImplAttribute(MethodImplOptions.AggressiveInlining)] 
static byte CharToSixBit(char c)
{
    if (c == '!')
        return 0;
    if (c == '$')
        return 1;
    if (c <= '9')
        return (byte)((int)c - (int)'0' + 2);
    if (c <= 'Z')
        return (byte)((int)c - (int)'A' + 12);
    return (byte)((int)c - (int)'a' + 38);
}

You can just pass false to the ascending parameter to ensure the encoded value will sort in the opposite direction. I selected ! and $ to complete the Base64 set since they are valid for RowKey values. This algorithm can be easily amended to support additional characters, though I firmly believe that larger numbers do not make sense for RowKey values as table storage keys must be efficiently segmented. Here are some examples of output:

0 -> !!!! asc & zzzz desc

1000 -> !!Dc asc & zzkL desc

2000 -> !!TE asc & zzUj desc

3000 -> !!is asc & zzF5 desc

4000 -> !!yU asc & zz$T desc

5000 -> !$C6 asc & zylr desc

6000 -> !$Rk asc & zyWD desc

7000 -> !$hM asc & zyGb desc

8000 -> !$x! asc & zy0z desc

9000 -> !0Ac asc & zxnL desc

like image 114
Jamie Thomas Avatar answered Oct 10 '22 19:10

Jamie Thomas


I found an easy way but the previous solution is more efficient (regarding key length). Instead of using all alphabets we can use just the numbers and the key is to make the length fixed (0000,0001,0002,.....):

public class ReadingEntity : TableEntity
{
    public static string KeyLength = "000000000000000000000";
    public ReadingEntity(string partitionId, int keyId)
    {
        this.PartitionKey = partitionId;
        this.RowKey = keyId.ToString(KeyLength); ;


    }
    public ReadingEntity()
    {
    }
}


public IList<ReadingEntity> Get(string partitionName,int date,int enddate)
{
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

        // Create the CloudTable object that represents the "people" table.
        CloudTable table = tableClient.GetTableReference("Record");

        // Construct the query operation for all customer entities where PartitionKey="Smith".
        TableQuery<ReadingEntity> query = new TableQuery<ReadingEntity>().Where(TableQuery.CombineFilters(
    TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partitionName),
    TableOperators.And,TableQuery.CombineFilters(
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, enddate.ToString(ReadingEntity.KeyLength)), TableOperators.And,
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, date.ToString(ReadingEntity.KeyLength)))));
        return table.ExecuteQuery(query).ToList();
}

Hope this helps.

like image 30
Mais Alragheb Avatar answered Oct 10 '22 19:10

Mais Alragheb