 

Deleting hundreds of entities from Azure Table Storage

My coworker and I are currently implementing an undo/redo feature for products. We're using the command pattern and saving every command (e.g. create component 'myFancyComponent') as a TableStorageEntry in the table 'ProductHistory'. The PartitionKey in this table is the unique product name; the RowKey is a unique identifier.

The problem is that when a product gets deleted, its whole history in table storage should be deleted as well. Using batch deletes could get quite complicated because of several limitations: at most 100 entities or at most 4 MB per batch (whichever is less). Are there any best practices for this problem? Or is the only solution to query all entries, check their size, and then batch delete the right number of entries?
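
For illustration, a minimal sketch of the chunked batch delete using the azure-data-tables Python SDK (the connection string is a placeholder, and `delete_product_history` is a hypothetical helper). Since all history rows of a product share its PartitionKey, they qualify for entity-group transactions, and because a delete only sends the keys, in practice it's the 100-entity limit that bites rather than the 4 MB one:

```python
from azure.data.tables import TableServiceClient

CONN_STR = "UseDevelopmentStorage=true"  # placeholder connection string
service = TableServiceClient.from_connection_string(CONN_STR)
history = service.get_table_client("ProductHistory")

def delete_product_history(product_name: str) -> None:
    # All history rows of a product share its PartitionKey, so they are
    # eligible for entity-group transactions (a batch must stay within
    # one partition and may contain at most 100 operations).
    entities = history.query_entities(
        query_filter=f"PartitionKey eq '{product_name}'",
        select=["PartitionKey", "RowKey"],  # keys are all a delete needs
    )
    batch = []
    for entity in entities:
        batch.append(("delete", entity))
        if len(batch) == 100:  # flush at the per-batch operation limit
            history.submit_transaction(batch)
            batch = []
    if batch:
        history.submit_transaction(batch)
```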

I also found this similar question and considered creating a table for EACH product. This approach seems to have some advantages:

  • Selecting should be faster, because only one table will be queried
  • Deleting is easy: just drop the whole table (see the sketch after this list)
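
For completeness, a sketch of the create/drop calls for the per-product variant, again with the azure-data-tables Python SDK; the sanitizing rule and helper names are assumptions for illustration:

```python
import re
from azure.data.tables import TableServiceClient

CONN_STR = "UseDevelopmentStorage=true"  # placeholder connection string
service = TableServiceClient.from_connection_string(CONN_STR)

def history_table_name(product_name: str) -> str:
    # Table names must be alphanumeric, start with a letter, and be
    # 3-63 characters long, so product names need sanitizing.
    return "History" + re.sub(r"[^A-Za-z0-9]", "", product_name)

def create_product_history(product_name: str) -> None:
    service.create_table(history_table_name(product_name))

def delete_product_history(product_name: str) -> None:
    # Dropping the table removes the entire history in one call.
    service.delete_table(history_table_name(product_name))
```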

The only disadvantage I found (in the 'WINDOWS AZURE TABLE' white paper):

  • Note that, when a table is deleted, the same table name cannot be recreated for at least 30 seconds, while the table is being garbage collected.

Are there any other (performance) issues if I create several hundred tables?

asked Feb 22 '23 by Robar

1 Answer

I'd strongly recommend the table-per-product approach. As you pointed out, it eliminates the need to batch up the delete transactions, and if you want to get around the 30-second limit, you just need an index table that stores the relationship between products and their tables. You can then flag a product as deleted and reassign its table when it's re-added, giving you a simple bypass of the 30-second limitation. Another approach (which I'd also recommend) is to decouple the addition of products so that it happens asynchronously, allowing you to simply wait and create the table once the name is available again.
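
One way to sketch that index-table idea (names and helpers here are illustrative; unlike the flag-and-reassign variant described above, this version simply mints a fresh physical table name on every re-add, so the 30-second garbage-collection window never blocks):

```python
import uuid
from azure.data.tables import TableServiceClient

CONN_STR = "UseDevelopmentStorage=true"  # placeholder connection string
service = TableServiceClient.from_connection_string(CONN_STR)
index = service.create_table_if_not_exists("ProductTableIndex")

def add_product(product_name: str) -> str:
    # A fresh physical table name on every (re-)add means we never wait
    # for a deleted table of the same name to be garbage collected.
    physical = "hist" + uuid.uuid4().hex
    service.create_table(physical)
    index.upsert_entity({
        "PartitionKey": "products",
        "RowKey": product_name,
        "TableName": physical,
    })
    return physical

def remove_product(product_name: str) -> None:
    entry = index.get_entity("products", product_name)
    service.delete_table(entry["TableName"])
    index.delete_entity("products", product_name)
```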

As for performance issues, there should be none even if you have thousands of tables, provided you don't exceed the current limit of 5k transactions per second on the Azure storage account. :)

answered Mar 03 '23 by BrentDaCodeMonkey