
AWS SQS Dead Letter Queue notifications

I'm trying to design a small message processing system based on SQS, Lambda, and SNS. In case of failure, I'd like for the message to be enqueued in a Dead Letter Queue (DLQ) and for a webhook to be called.

I'd like to know what the most canonical or reasonable way of achieving that would look like.

Currently, if everything goes well, the process should be as follows:

  1. SQS (in place to handle retries) enqueues a message
  2. Lambda gets invoked by SQS and processes the message
  3. Lambda sends a webhook and finishes normally

If something in the Lambda goes wrong (the success webhook cannot be called, or the task at hand cannot be processed), the easiest way to achieve what I want seems to be to set up a first queue, DLQ1, that SQS would put the failed messages in. An auxiliary Lambda would then be invoked to process each failed message and pass it to SNS, which would call the failure webhook and also forward the message to DLQ2, the final/true DLQ.

Is that the best approach?

One alternative I know of is Alarms, though I've been warned that they are quite tricky. Another would be to have the Lambda call the error-reporting webhook itself if there's a failure on the last retry, although that somehow seems inappropriate.

Thanks!

Jan Benes asked Apr 15 '19 10:04



1 Answer

Your architecture looks good enough in case of success, but I personally find it quite confusing if anything goes wrong as I don't see why you need two DLQs to begin with.

Here's what I would do in case of failure:

  1. Define a DLQ on your source SQS queue and set maxReceiveCount to e.g. 3, meaning that a message that fails processing three times is moved to the configured DLQ.
  2. Create a Lambda that listens to this DLQ.
  3. Execute the webhook inside this Lambda.
  4. The message is deleted from the DLQ automatically once this Lambda has processed it successfully. Since you apparently want the failed messages to be persisted somewhere, store the content of the message in a file on S3 and store the file metadata (bucket and key) in a DynamoDB table, so you can always query for failed messages.
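
Steps 2 to 4 could look roughly like the sketch below. This is only an illustration under assumptions, not a definitive implementation: the environment variable names (`FAILURE_WEBHOOK_URL`, `FAILED_BUCKET`, `FAILED_TABLE`), the `failed/` key prefix, and the helper names are all made up for the example. The core loop takes the webhook call and the persistence step as plain functions so it can be exercised without AWS access.

```python
import json
import urllib.request


def build_failure_record(sqs_record):
    """Pick the fields worth keeping from one SQS record of a Lambda event."""
    return {
        "message_id": sqs_record["messageId"],
        "body": sqs_record["body"],
        "receive_count": sqs_record.get("attributes", {}).get("ApproximateReceiveCount"),
    }


def post_webhook(url, payload):
    """POST the payload as JSON to the failure webhook (the URL is a placeholder)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handle_dlq_event(event, notify, persist):
    """For each DLQ message: call the webhook (step 3), then persist it (step 4)."""
    handled = []
    for sqs_record in event["Records"]:
        failure = build_failure_record(sqs_record)
        notify(failure)
        persist(failure)
        handled.append(failure["message_id"])
    return handled


def lambda_handler(event, context):
    """Lambda entry point; resource names come from hypothetical env variables."""
    import os
    import boto3  # provided by the Lambda Python runtime

    bucket = os.environ["FAILED_BUCKET"]
    table = boto3.resource("dynamodb").Table(os.environ["FAILED_TABLE"])
    s3 = boto3.client("s3")

    def notify(failure):
        post_webhook(os.environ["FAILURE_WEBHOOK_URL"], failure)

    def persist(failure):
        key = f"failed/{failure['message_id']}.json"
        # Message content goes to S3; its location goes to DynamoDB for querying.
        s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(failure).encode("utf-8"))
        table.put_item(Item={"message_id": failure["message_id"], "bucket": bucket, "key": key})

    return handle_dlq_event(event, notify, persist)
```

If the webhook call or the persistence step raises, the invocation fails and the messages stay in the DLQ to be retried, so the webhook receiver should tolerate duplicate notifications.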

I don't see any role for SNS here unless you want multiple subscribers for a given message, but as far as I can see that is not the case.

This way, you need to maintain only one DLQ, and you can get rid of SNS, as it only adds an extra layer of complexity to your architecture.
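
Attaching that single DLQ to the source queue (step 1) comes down to setting the queue's `RedrivePolicy` attribute. A minimal sketch with boto3 follows; the function names are made up for the example, and the queue URL and DLQ ARN are values you would supply yourself.

```python
import json


def redrive_attributes(dlq_arn, max_receive_count=3):
    """Build the SQS queue attributes that point a source queue at its DLQ."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the redrive policy as a JSON-encoded string.
    return {"RedrivePolicy": json.dumps(policy)}


def attach_dlq(source_queue_url, dlq_arn, max_receive_count=3):
    """Apply the redrive policy to the source queue (needs AWS credentials)."""
    import boto3  # imported lazily so redrive_attributes works without boto3

    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(
        QueueUrl=source_queue_url,
        Attributes=redrive_attributes(dlq_arn, max_receive_count),
    )
```

The same policy can just as well be set in the console or in whatever infrastructure-as-code tool you already use; the point is only that one `RedrivePolicy` on the source queue is all the DLQ wiring this design needs.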

Thales Minussi answered Sep 18 '22 02:09
