In Azure table storage's batch save operation, is there an efficient way to replace certain properties on an entity if it already exists; but update all properties for new entities?
Here is the scenario that I am talking about.
I have an entity called Order
public class Order : TableEntity
{
public Order(String department, string orderId)
{
this.PartitionKey = department;
this.RowKey = orderId;
}
public DateTime CreatedOn {get; set;}
public String CreatedBy {get; set;}
public DateTime UpdatedOn {get; set;}
public String UpdatedBy {get; set;}
//Class contains other properties which could add up to 1MB
}
Scenario
Can I do it in a single step without reading contents from the Table Store?
Here is a one way to do it (very rough pseudo code)
function Upsert(Dictionary<String, Order> ordersInput)
{
//1. Read existing ordersInput from database
var existingOrders = Retrieve(ordersInput.Values);
//2. Update 'ordersInput' with existing data
foreach(var existingOrder in existingOrders)
{
if(ordersInput.ContainsKey(existingOrder.RowKey)
{
ordersInput[existingOrder.RowKey].CreatedOn = existingOrder.CreatedOn;
ordersInput[existingOrder.RowKey].CreatedBy = existingOrder.CreatedBy;
}
}
//Save all merged orders to Azure
SaveToAzure(existingOrders);
}
The issue I have with the above approach is that each order entity is 1 MB in size and reading all entities bogs down the API save operation.
Is there a more efficient way to perform the conditional merge entirely on azure?
I was also thinking about doing a batch insert in the following way
(The above approach sounds hacky and I think it might cause a bunch of concurrency issues)
Azure Table Storage natively support Upsert
operation via InsertOrReplace
and InsertOrMerge
functions. InsertOrReplace
will replace an entity completely with the new entity if it exists otherwise it will create a new one. InsertOrMerge
will change the properties of an existing entity which are present in the new entity if the entity with the same PartitionKey/RowKey exists otherwise it will create a new one.
UPDATE
Here's an alternate approach. A few things you can do:
PartitionKey
and RowKey
so that you can determine if an entity already exists or not. This would considerably reduce the response payload.CreatedOn
and CreatedBy
fields for existing entities, you would need to make them nullable especially CreatedOn
.Based on this, you would first fetch the existing entities. Retrieve operation would only return PartitionKey
and RowKey
(using query projection technique). Then you would loop through the ordersInput
and see if the entity exists. If the entity exists, then you will set CreatedOn
and CreatedBy
as null
and mark that entity to be Merged
. If the entity does not exist, you will set all the properties and mark that entity to be Inserted
. Then you will send this batch request to table service.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With