AWS Beanstalk: Exponential backoff for SQS?

Question

We are using the worker tier on Beanstalk to send out webhooks. We need to use exponential backoff in case of any error when contacting the third party. However, it is unclear to me how this would work.

If the job fails and I call ChangeMessageVisibility with an increasing backoff time, I have two choices:

Return a success 200. Then SQS will remove it from the queue - not good.
Return an error code. Then SQS will override the message visibility to the default value?

From Environment Tiers - AWS Beanstalk:

A web application in a worker environment tier should only listen on the local host. When the web application in the worker environment tier returns a 200 OK response to acknowledge that it has received and successfully processed the request, the daemon sends a DeleteMessage call to the SQS queue so that the message will be deleted from the queue. (SQS automatically deletes messages that have been in a queue for longer than the configured RetentionPeriod.) If the application returns any response other than 200 OK, then Elastic Beanstalk waits to put the message back in the queue after the configured VisibilityTimeout period. If there is no response, then Elastic Beanstalk waits to put the message back in the queue after the InactivityTimeout period so that the message is available for another attempt at processing.

Elliot Chance · Accepted Answer

ChangeMessageVisibility has a limit of 12 hours and only applies to in-flight jobs (jobs that are currently being processed, where you want to tell SQS "I need more time to complete this").

The only solution is to create a new job in the queue with the same details plus a retry counter (in the message body or as a message attribute), and to set DelaySeconds using an exponential backoff based on retries + 1. Then return a 200 so the original message is deleted.
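A minimal sketch of that re-queue step in Python, assuming boto3 is used to talk to SQS. The helper names (backoff_delay, build_retry_message), the 30-second base delay, and the "retries" attribute name are made up for illustration; only the send_message parameters are real SQS ones. Note that a per-message DelaySeconds applies to standard queues.

```python
import json

MAX_DELAY_SECONDS = 900  # SQS DelaySeconds upper limit (15 minutes)


def backoff_delay(retries, base_seconds=30):
    """Exponential backoff: base * 2**retries, capped at the SQS limit.

    base_seconds is an assumed starting delay, not something SQS defines.
    """
    return min(base_seconds * (2 ** retries), MAX_DELAY_SECONDS)


def build_retry_message(queue_url, payload, retries):
    """Build send_message keyword arguments for a re-queued copy of a failed job.

    `retries` is the counter read from the failed message (0 on first failure).
    """
    next_retries = retries + 1
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(payload),
        # Delay is based on retries + 1, as described above.
        "DelaySeconds": backoff_delay(next_retries),
        "MessageAttributes": {
            "retries": {"DataType": "Number", "StringValue": str(next_retries)},
        },
    }


# Usage with boto3 (assumed to be installed and configured):
#   import boto3
#   sqs = boto3.client("sqs")
#   sqs.send_message(**build_retry_message(queue_url, payload, retries))
#   ...then respond 200 so the worker daemon deletes the original message.
```

With these assumed values the delays go 60, 120, 240, 480 seconds and then stay at the 900-second cap, which is where the DelaySeconds limit starts to matter.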

Unfortunately, DelaySeconds has a limit of 15 minutes (900 seconds), so to schedule a job further out than that you have a few options:

Keep rescheduling the job every 15 minutes but don't carry out the task until the retry count gets high enough. For a one-day delay this would run 95 jobs that do nothing until the 96th. This could generate a colossal number of dummy jobs.
Put the job somewhere else (like a database or cache) and use a cron or some other scheduled process to put it back in the queue once a minimum timestamp is reached. The timestamp would be now + 1 day, for example.
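Both options can be sketched roughly as follows. This is only an illustration: the function names and the not_before field (an epoch timestamp stored with the job) are assumptions, and the storage and queue calls are left out.

```python
import math

MAX_DELAY_SECONDS = 900  # SQS DelaySeconds upper limit (15 minutes)


def hops_needed(total_delay_seconds):
    """Number of 15-minute deliveries needed to cover a long delay (option 1)."""
    return math.ceil(total_delay_seconds / MAX_DELAY_SECONDS)


def next_step(not_before, now):
    """Option 1: decide on each delivery whether to run the job or re-delay it.

    Returns ("run", 0) when the job is due, otherwise ("requeue", delay)
    where delay is the DelaySeconds to use for the next dummy hop.
    """
    remaining = not_before - now
    if remaining <= 0:
        return ("run", 0)
    return ("requeue", min(math.ceil(remaining), MAX_DELAY_SECONDS))


def release_due(parked_jobs, now):
    """Option 2: a scheduled process splits parked jobs into due and waiting.

    parked_jobs stands in for rows read from a database or cache.
    """
    due = [job for job in parked_jobs if job["not_before"] <= now]
    waiting = [job for job in parked_jobs if job["not_before"] > now]
    return due, waiting
```

For a one-day delay, hops_needed(24 * 60 * 60) gives 96, which matches the 95 do-nothing jobs plus the one that does the work. The second option avoids those hops at the cost of running and monitoring a separate scheduler.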

metakungfu · Answer

There are pros and cons to increasing the visibility timeout of a failing job with ChangeMessageVisibility:

Pros:

you won't lose the job in the process of removing it and requeuing it.

Cons:

the 12-hour limit for a job to be in flight.
you can only have 20k in-flight jobs at a time.

So one idea to mitigate the cons would be to set up a redrive policy to a dead-letter queue (DLQ) if the job fails too many times.
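As a rough sketch, a redrive policy is a JSON string set as the RedrivePolicy attribute on the source queue. The helper name, the ARN, and the max_receive_count of 5 below are placeholders chosen for illustration; deadLetterTargetArn and maxReceiveCount are the actual SQS keys.

```python
import json


def redrive_policy_attributes(dlq_arn, max_receive_count=5):
    """Build the Attributes dict that routes repeatedly failing messages to a DLQ.

    After a message has been received max_receive_count times without being
    deleted, SQS moves it to the queue identified by dlq_arn.
    """
    return {
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": str(max_receive_count),
        })
    }


# Usage with boto3 (assumed to be installed and configured; ARN is hypothetical):
#   import boto3
#   sqs = boto3.client("sqs")
#   sqs.set_queue_attributes(
#       QueueUrl=queue_url,
#       Attributes=redrive_policy_attributes(
#           "arn:aws:sqs:us-east-1:123456789012:webhooks-dlq"),
#   )
```

This bounds how long a failing job can keep cycling, so it does not sit in flight indefinitely or count against the in-flight limit forever.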

Tags: php, amazon-web-services, amazon-sqs, amazon-elastic-beanstalk
