I am a total newbie with Azure! The purpose is to return the rows based on the timestamp stored in the RowKey. As there is a transaction cost with each query, I want to minimize the number of transactions/queries whilst maintain performance
These are the proposed Partition and Row Keys:
Legend:
I would like to get back from a single query the rows and any childrows that is > or < DateOfMessage_MessageId
Can this be done via my proposed PartitionKeys and RowKeys?
ie.. (in psuedo code)
var results = ctx.PartitionKey.StartsWith(TextCache_AccountId)
&& ctx.RowKey > (TimeStamp)_MessageId
Secondly, if there I have a number of accounts, and only want to return back the first 10, could it be done via a single query
ie.. (in psuedo code)
var results = (
(
ctx.PartitionKey.StartsWith(TextCache_(AccountId1)) &&
&& ctx.RowKey > (TimeStamp1)_MessageId1 )
)
||
(
ctx.PartitionKey.StartsWith(TextCache_(AccountId2)) &&
&& ctx.RowKey > (TimeStamp2)_MessageId2 )
) ...
)
.Take(10)
The row key is a unique identifier for an entity within a given partition. Together the PartitionKey and RowKey uniquely identify every entity within a table. The row key is a string value that may be up to 1 KiB in size. You must include the RowKey property in every insert, update, and delete operation.
How should you choose a good partition key for a Table storage implementation? (Choose all that apply.) They should always be unique, like a primary key in a SQL table. You should always use the same partition key for all records. Think about how you're likely to update the data using batch transactions.
You can write queries against the Table service using LINQ syntax.
Enter an Account Name, Account Key, and Table Name on the Azure Table tab of the New Session dialog. Select either HTTP or HTTPS as the connection Protocol. Ensure that the Analysis Grid viewer is selected in the Start With drop-down list. Start retrieving data by clicking the Start button in the New Session dialog.
The short answer to your questions is yes, but there are some things you need to watch for.
Azure table storage doesn't have a direct equivalent of .StartsWith()
. If you're using the storage library in combination with LINQ you can use .CompareTo()
(> and < don't translate properly) which will mean that if you run a search for account 1 and you ask the query to return 1000 results, but there are only 600 results for account 1, the last 400 results will be for account 10 (the next account number lexically). So you'll need to be a bit smart about how you deal with your results.
If you padded out the account id with leading 0s you could do something like this (pseudo code here as well)
ctx.PartionKey > "TextCache_0000000001"
&& ctx.PartitionKey < "TextCache_0000000002"
&& ctx.RowKey > "123465798"
Something else to bear in mind is that queries to Azure Tables return their results in PartitionKey
then RowKey
order. So in your case messages without a ParentMessageId
will be returned before messages with a ParentMessageId
. If you're never going to query this table by ParentMessageId
I'd move this to a property.
If TextCache_
is just a string constant, it's not adding anything by being included in the PartitionKey
unless this will actually mean something to your code when it's returned.
While you're second query will run, I don't think it will produce what you're after. If you want the first ten rows in DateOfMessage
order, then it won't work (see my point above about sort orders). If you ran this query as it is and account 1 had 11 messages it will return only the first 10 messages related to account 1 regardless if whether account 2 had an earlier message.
While trying to minimise the number of transactions you use is good practice, don't be too concerned about it. The cost of running your worker/web roles will dwarf your transaction costs. 1,000,000 transactions will cost you $1 which is less than the cost of running one small instance for 9 hours.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With