
Duplicate detection in Azure Storage Queue

Is there an elegant way to ensure that a Queue always contains distinct messages? (I don't mean anything tied to a Duplicate Detection Window or any other time period.)

I know that Service Bus Queues provide sessions, which could serve my purpose (as mentioned, Service Bus duplicate detection won't help me because it depends on a time period), but I don't want my component to depend on another Azure service just for this feature.

Thanks,

Gaurav Tiwari asked Jan 11 '13 14:01




1 Answer

This is not possible to do reliably.

There is no mechanism to query a Storage queue and find out whether a message with the same contents is already there or was there before. You can try to implement your own logic using a storage table, but that will not be reliable: the insert into the table may succeed and the insert into the queue may then fail, leaving the table with an entry for a message that was never queued.
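To make that failure mode concrete, here is a minimal sketch in plain Python. It does not use the Azure SDK: `FlakyQueue`, `enqueue_if_new`, and the `seen` set are hypothetical stand-ins for the queue, the dedup logic, and the storage table, and the failure is simulated.

```python
class FlakyQueue:
    """In-memory stand-in for a Storage queue whose send can fail (hypothetical)."""

    def __init__(self, fail_next=False):
        self.messages = []
        self.fail_next = fail_next

    def send(self, body):
        if self.fail_next:
            self.fail_next = False
            raise ConnectionError("simulated enqueue failure")
        self.messages.append(body)


def enqueue_if_new(seen, queue, body):
    """Naive dedup: record the body in the 'table' first, then enqueue."""
    if body in seen:
        return False
    seen.add(body)    # step 1: the table insert succeeds
    queue.send(body)  # step 2: may fail, leaving a marker with no message
    return True


seen = set()  # stands in for the storage table
queue = FlakyQueue(fail_next=True)

try:
    enqueue_if_new(seen, queue, "order-42")
except ConnectionError:
    pass

# The table now claims the message exists, but the queue is empty,
# so the retry is rejected as a duplicate and the message is never queued.
retried = enqueue_if_new(seen, queue, "order-42")
print(retried, queue.messages)
```

Swapping the two steps does not help: if the enqueue succeeds and the table insert fails, a retry will enqueue the same message a second time.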

Your code should always assume that it can retrieve a message containing data that was already processed. Messages can come back to the queue when the workers processing them crash or take too long.
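One common way to live with redelivery is to make the consumer skip work it has already done. The sketch below is plain Python with no Azure SDK calls. `process_once`, `processed_ids`, and the application-level id carried with each message are assumptions for illustration, and the in-memory set stands in for whatever durable store you would actually use.

```python
def process_once(processed_ids, message_id, body, handler):
    """Run handler only if this logical message has not been handled yet."""
    if message_id in processed_ids:
        return False  # redelivery: skip the side effect
    handler(body)
    processed_ids.add(message_id)
    return True


processed_ids = set()  # stands in for a durable store (assumption)
results = []

# The same logical message delivered twice, e.g. after a worker took too long.
deliveries = [("order-42", "charge card"), ("order-42", "charge card")]
outcomes = [
    process_once(processed_ids, message_id, body, results.append)
    for message_id, body in deliveries
]
print(outcomes, results)
```

Note that a crash between `handler(body)` and recording the id would still run the handler again on redelivery, so the handler itself should be safe to repeat where possible.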

Igorek answered Sep 27 '22 19:09
