We are using the worker tier on Beanstalk to send out webhooks. We need to use exponential backoff in case of any error when contacting the third party. However, it is unclear to me how this would work.
If the job fails and I invoke ChangeMessageVisibility with an increasing backoff time, I have two choices:
From Environment Tiers - AWS Beanstalk:
A web application in a worker environment tier should only listen on the local host. When the web application in the worker environment tier returns a 200 OK response to acknowledge that it has received and successfully processed the request, the daemon sends a DeleteMessage call to the SQS queue so that the message will be deleted from the queue. (SQS automatically deletes messages that have been in a queue for longer than the configured RetentionPeriod.) If the application returns any response other than 200 OK, then Elastic Beanstalk waits to put the message back in the queue after the configured VisibilityTimeout period. If there is no response, then Elastic Beanstalk waits to put the message back in the queue after the InactivityTimeout period so that the message is available for another attempt at processing.
ChangeMessageVisibility has a limit of 12 hours and only applies to in-flight jobs (jobs that are still running, where you want to tell SQS "I need more time to complete this").
The only solution is to create a new job in the queue with the same details plus a retry counter (in the message body or as a message attribute), and to set DelaySeconds using an exponential backoff based on retries + 1.
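As a rough sketch of that re-enqueue step, the helpers below compute a delay from the retry counter and build the arguments for an SQS SendMessage call. The function names, the 30-second base delay, and the attribute name "retries" are assumptions made for illustration; only the parameter names (QueueUrl, MessageBody, DelaySeconds, MessageAttributes) come from the SQS API.

```python
import json

# SQS rejects DelaySeconds values above 900 (15 minutes).
MAX_DELAY_SECONDS = 900


def backoff_delay_seconds(retries, base_seconds=30):
    """Exponential backoff based on retries + 1, capped at the SQS limit.

    base_seconds is an assumed starting value; tune it for your webhooks.
    """
    attempt = retries + 1
    return min(base_seconds * 2 ** attempt, MAX_DELAY_SECONDS)


def build_retry_message(queue_url, payload, retries):
    """Build keyword arguments for re-enqueueing a failed job.

    The retry counter travels as a message attribute (hypothetical name
    "retries") so the next attempt can read it and back off further.
    """
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(payload),
        "DelaySeconds": backoff_delay_seconds(retries),
        "MessageAttributes": {
            "retries": {"DataType": "Number", "StringValue": str(retries + 1)},
        },
    }
```

Assuming a boto3 SQS client named sqs, the result could be sent with sqs.send_message(**build_retry_message(url, payload, retries)). The worker would then return 200 OK for the original request so that the daemon deletes the failed message instead of redelivering it.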
Unfortunately, DelaySeconds has a limit of 15 minutes (900 seconds), so to schedule a job further out than that you have a few options:
There are pros and cons to increasing the ChangeMessageVisibility of a failing job:
Pros:
Cons:
So one idea to mitigate the cons would be to set up a redrive policy that moves the job to a dead-letter queue (DLQ) if it fails too many times.
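A minimal sketch of such a redrive policy, assuming the dead-letter queue already exists and you know its ARN. The helper name and the default of 5 receives are assumptions; RedrivePolicy, deadLetterTargetArn and maxReceiveCount are the SQS attribute names.

```python
import json


def build_redrive_attributes(dlq_arn, max_receive_count=5):
    """Build the queue attributes that enable a redrive policy.

    SQS expects RedrivePolicy as a JSON-encoded string. After a message
    has been received max_receive_count times without being deleted,
    SQS moves it to the queue identified by dlq_arn.
    """
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    return {"RedrivePolicy": json.dumps(policy)}
```

With a boto3 SQS client named sqs (again an assumption), this could be applied with sqs.set_queue_attributes(QueueUrl=url, Attributes=build_redrive_attributes(arn)).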