
Azure Storage Table Paging

Implementing simple paging in Azure Table Storage is relatively straightforward (see "Paging with Windows Azure Table Storage"). It can be done with continuation tokens.

But.

This is only a start for serious paging. The first problem is sorting: you cannot do an OrderBy in Azure Table Storage. What would be the best way to overcome this? Pages must be sorted; that is a hard requirement.

The second problem is knowing the total number of pages. With continuation tokens alone this is not possible. Calling ".Count()" on every page request seems very inefficient to me (partitions could be spread across multiple servers, for instance).

The third problem is related to the second: even if you can count how many pages there are, how do you "connect" the counted pages to the actual continuation tokens? This is the biggest mystery for me. How do you get a continuation token for a specific table row?

I would be very happy if a correct solution could be provided. I must admit I also have one of my own, and I will write it up in one of the answers below.

Peter Stegnar asked Jun 13 '11 08:06


1 Answer

I know this doesn't answer your question in the way you asked, but I do not believe paging should be performed the way you suggest. Since Azure Table Storage does not support the functionality you require, it may not be a good fit.

I would load the data into a local cache, do the ordering and paging there, and be done with it. There is a suggested workaround for this limitation that relies on carefully constructing the RowKey/PartitionKey, but I would strongly suggest you not follow it:

Blog blog = new Blog();
// Note the fixed length of 19 being used since the max tick value is 19 digits long.
string rowKeyToUse = string.Format("{0:D19}", 
        DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks);
blog.RowKey = rowKeyToUse;

So a blog b1 dated 10/1/2008 10:00:00 AM will have 2521794455999999999 as its RowKey, and b2 dated 10/2/2008 10:00:00 AM will have 2521793591999999999, and hence b2 will precede b1.
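To make the key arithmetic concrete, here is a minimal sketch that computes the reversed-tick key for a given timestamp. The class and method names (`ReverseTickKey`, `For`) are my own for illustration and are not part of any Azure library.

```csharp
using System;

public static class ReverseTickKey
{
    // Builds a RowKey that sorts newest-first: a later timestamp
    // produces a smaller reversed-tick value.
    public static string For(DateTime timestamp)
    {
        long reversed = DateTime.MaxValue.Ticks - timestamp.Ticks;
        // D19 zero-pads to 19 digits so that string order matches numeric order.
        return reversed.ToString("D19");
    }
}
```

Calling `ReverseTickKey.For(new DateTime(2008, 10, 1, 10, 0, 0))` gives the b1 key quoted above, and the key for 10/2/2008 compares lower under an ordinal string comparison, which is why b2 sorts first.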

To retrieve all blogs dated after 10/1/2008 10:00:00 AM, we will use the following query:

// Build the key from the cutoff date. Newer blogs have smaller
// reversed-tick keys, so "dated after" means RowKey < cutoff key.
DateTime cutoff = new DateTime(2008, 10, 1, 10, 0, 0, DateTimeKind.Utc);
string rowKeyToUse = string.Format("{0:D19}",
    DateTime.MaxValue.Ticks - cutoff.Ticks);
var blogs =
    from blog in context.CreateQuery<Blog>("Blogs")
    where blog.PartitionKey == "Football"
        && blog.RowKey.CompareTo(rowKeyToUse) < 0
    select blog;

(Adapted from the Windows Azure Table documentation, Dec. 2008, provided by Microsoft.)

As for counting the number of pages, that's easy: divide the total row count by the page size and round up. As for continuation tokens, one way would be to "walk" each page on the initial request and collect its continuation token, which basically just tells you which partition and row keys come next. But holding on to all of them leaves you vulnerable to consistency errors (e.g. if someone inserts something into the same table in the meantime).
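A rough sketch of both ideas, using an in-memory sorted key list as a stand-in for the table. Everything here is hypothetical: `FetchSegment` only imitates a segmented query, and the "token" is simply the key that opens the next page, whereas a real continuation token carries the next partition key and row key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenWalk
{
    // Number of pages for a known row count (ceiling division).
    public static int PageCount(int totalRows, int pageSize) =>
        (totalRows + pageSize - 1) / pageSize;

    // Hypothetical stand-in for one query segment: returns up to pageSize keys
    // starting at 'token', plus the token for the next segment (null at the end).
    public static (List<string> Rows, string NextToken) FetchSegment(
        List<string> sortedKeys, string token, int pageSize)
    {
        int start = token == null ? 0 : sortedKeys.IndexOf(token);
        var rows = sortedKeys.Skip(start).Take(pageSize).ToList();
        int next = start + rows.Count;
        return (rows, next < sortedKeys.Count ? sortedKeys[next] : null);
    }

    // Walks every page once and records the token that opens each page.
    // Index 0 holds null, because the first page needs no token.
    public static List<string> CollectPageTokens(List<string> sortedKeys, int pageSize)
    {
        var tokens = new List<string>();
        string token = null;
        do
        {
            tokens.Add(token);
            token = FetchSegment(sortedKeys, token, pageSize).NextToken;
        } while (token != null);
        return tokens;
    }
}
```

With the token list in hand, page n can be requested directly with `tokens[n - 1]`. The list is only a snapshot, though: rows inserted or deleted after the walk will shift the real boundaries, which is the consistency problem mentioned above.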

Personally, I would page based on RowKeys, as I described above, or, if this kind of paging is a hard requirement, move to a storage engine that supports it.
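Paging on RowKeys could look roughly like the sketch below: remember the last RowKey of the page you just showed and ask for the rows that sort after it. It runs against an in-memory collection purely for illustration; `RowKeyPaging` and `NextPage` are made-up names, and against the real table the same comparison would go into the query filter instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowKeyPaging
{
    // Returns the page that follows 'lastRowKey' (pass null for the first page).
    // Keys are compared ordinally, mirroring how fixed-width keys sort as strings.
    public static List<string> NextPage(
        IEnumerable<string> rowKeys, string lastRowKey, int pageSize) =>
        rowKeys
            .Where(k => lastRowKey == null || string.CompareOrdinal(k, lastRowKey) > 0)
            .OrderBy(k => k, StringComparer.Ordinal)
            .Take(pageSize)
            .ToList();
}
```

This only moves forward one page at a time and gives no total page count, but it needs no stored tokens and keeps working when rows are inserted between requests.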

To elaborate a bit further: if you know you will only ever have one "OrderBy" clause, you can select all of the rows and infer from the sorted keys what the page boundaries will be.

On a side note, I believe the paging provided is there not to enable paging on the front end but to alleviate the 1000-result limit. But these are just my $0.02.

Anže Vodovnik answered Oct 19 '22 02:10