
Use the same PartitionKey and RowKey

I know it works, but I would like to know whether it is good practice to use the same string as both PartitionKey and RowKey.

The scenario is a single table in which all items are unique: a Customer table where every row holds information about a single customer.

For example, I will have the unique customer ID and want to use it to fetch the record by PartitionKey + RowKey, so the lookup is fast and returns a single item.

What do you think?

asked Oct 29 '13 at 23:10 by user2818430



2 Answers

This will certainly make your customer lookup quick. The RowKey can be an empty string, so technically you don't have to make the PartitionKey and RowKey match if every customer has its own unique partition.
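As a rough illustration of that point-lookup idea, here is a minimal sketch in plain Python (no Azure SDK involved). The helper names `customer_entity` and `point_query_filter` are made up for this example, and the customer ID and name are placeholder values:

```python
def customer_entity(customer_id: str, name: str) -> dict:
    # One partition per customer: the ID is the PartitionKey and the
    # RowKey is left as an empty string (repeating the ID would also work).
    return {"PartitionKey": customer_id, "RowKey": "", "Name": name}


def point_query_filter(partition_key: str, row_key: str) -> str:
    # OData-style filter that names both keys exactly, i.e. a point query.
    # Single quotes inside a value are escaped by doubling them.
    pk = partition_key.replace("'", "''")
    rk = row_key.replace("'", "''")
    return f"PartitionKey eq '{pk}' and RowKey eq '{rk}'"


entity = customer_entity("12345", "Contoso")
print(point_query_filter(entity["PartitionKey"], entity["RowKey"]))
# prints: PartitionKey eq '12345' and RowKey eq ''
```

Because both keys are fully specified, the lookup can only ever match a single entity, whichever of the two RowKey choices you pick.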

A couple of things to note here:

  • You're giving up adding or updating customers in batches. Only entities in the same partition can be worked with in a batch, so with a single-entity-per-partition scheme there will be no batches. Given what you've outlined above, I don't think this will bother you.
  • Any sort of range query against the PartitionKey, such as all customers between 1 and 200, may end up spanning multiple partition servers, making it a very inefficient query. Again, if you only ever look up one customer at a time and never in groups, you should be fine. You might want to think about the scenario where you have to add a property to EVERY customer in your system and how you would handle that if it became necessary (a multi-threaded updater with a set of known customer IDs may be just fine, but you should at least think about it).
  • Try to avoid an append-only pattern. If your customer IDs are consecutive, then as you add them they will initially land on the same partition server. Only after a segment of them gets hot will they be moved off to another server. It's better to hash the ID and use that as the PartitionKey, which scatters the entities across multiple partition servers if you start really hammering on them. Depending on your load, you may never actually see the difference.
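The hashing suggestion in the last bullet could be sketched roughly like this. SHA-256 is just one possible choice of hash, and `partition_key_for` and `customer_keys` are hypothetical helper names, not part of any Azure library:

```python
import hashlib


def partition_key_for(customer_id: str) -> str:
    # Hash the ID so that consecutive IDs no longer sort next to each other.
    # The hex digest contains only 0-9 and a-f, which are safe key characters.
    return hashlib.sha256(customer_id.encode("utf-8")).hexdigest()


def customer_keys(customer_id: str) -> dict:
    # The hash is deterministic, so a point lookup is still possible
    # from the customer ID alone: recompute the PartitionKey when reading.
    return {"PartitionKey": partition_key_for(customer_id), "RowKey": customer_id}


for cid in ("1001", "1002", "1003"):
    keys = customer_keys(cid)
    print(cid, keys["PartitionKey"][:8])
```

The trade-off is that the PartitionKey no longer sorts in customer ID order, so range queries over IDs get no help from the key at all.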

Check out the "How to get most out of Windows Azure Tables" article on choosing partition keys. Most of what I said here is in there as well (it's one of the places I learned it from), plus more.

answered Sep 27 '22 at 19:09 by MikeWo


Using a constant string such as "0" as your RowKey gives the same uniqueness as repeating the PartitionKey: the pair (PK, "0") is exactly as unique as (PK, PK).

A practical approach is to consider your most common query. You might put the zip/postal code in the PartitionKey and the customer GUID in the RowKey, provided your customer base is spread evenly over the country. The PartitionKey doesn't have to be the primary key...
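A minimal sketch of that layout, in plain Python with no Azure SDK. The helper names `customer_in_zip` and `group_by_partition` and the sample zip codes are made up for illustration:

```python
import uuid
from collections import defaultdict
from typing import Optional


def customer_in_zip(zip_code: str, name: str, customer_id: Optional[str] = None) -> dict:
    # PartitionKey groups customers by zip/postal code;
    # RowKey is the customer GUID, unique within that partition.
    row_key = customer_id if customer_id is not None else str(uuid.uuid4())
    return {"PartitionKey": zip_code, "RowKey": row_key, "Name": name}


def group_by_partition(entities: list) -> dict:
    # Entities that share a PartitionKey end up in the same group.
    groups = defaultdict(list)
    for entity in entities:
        groups[entity["PartitionKey"]].append(entity)
    return dict(groups)


customers = [
    customer_in_zip("98052", "Alice"),
    customer_in_zip("98052", "Bob"),
    customer_in_zip("10001", "Carol"),
]
for zip_code, group in group_by_partition(customers).items():
    print(zip_code, len(group))
```

Note that with this layout a point lookup needs the zip code as well as the GUID, so it only pays off if your common queries really are per area.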

answered Sep 27 '22 at 18:09 by Gabe Rainbow