Why is ApproximateAgeOfOldestMessage in SQS not bigger than approx 5 mins

I am using Spring Cloud AWS Messaging (2.0.1.RELEASE) in Java to consume from an SQS queue. In case it's relevant, we use default settings, Java 10 and Spring Cloud Finchley.SR2.

We recently had an issue where a message could not be processed due to an application bug, leading to an exception and no confirmation (deletion) of the message. The message is later retried (this is desirable), presumably after the visibility timeout has elapsed. Again, default values are in use; we have not customised the settings here.

We didn't spot the error above for a few days, meaning the message receive count was very high and the message had conceptually been on the queue for a while (several days by now). We considered creating a CloudWatch SQS alarm to alert us to a similar situation in future. The only suitable metric appeared to be ApproximateAgeOfOldestMessage.

Sadly, when observing this metric I see this:

[Image: CloudWatch graph of ApproximateAgeOfOldestMessage]

The max age doesn't go much above 5 mins (despite me knowing the message was several days old). If a message gets older each time a receive happens, assuming no acknowledgment comes and the message isn't deleted but instead becomes available again after the visibility timeout has elapsed, should this graph not be much, much higher?

I don't know if this is something specific to the way that Spring Cloud AWS Messaging consumes the message or whether it's a general SQS quirk. My expectation was that if a message was put on the queue 5 days ago, and a consumer had not successfully consumed it, then the max age would be 5 days.

Is it in fact the case that if a message is received by a consumer but not ultimately deleted, the max age is actually the time between receive calls?

Can anyone confirm whether my expectation is incorrect, i.e. this is indeed how SQS is expected to behave (it doesn't consider the age to be the time since the message was first put on the queue, but instead considers it to be the time between receive calls)?

David asked Nov 26 '18 19:11


2 Answers

Based on a similar question on AWS forums, this is apparently a bug with regular SQS queues where only a single message is affected.

To get a useful alarm for this issue, I would suggest setting up a dead-letter queue (where messages are automatically delivered after a configurable number of receives without a delete), and alarming on the size of the dead-letter queue (ApproximateNumberOfMessagesVisible).
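As a rough illustration of that suggestion, here is a minimal Java sketch that only builds the relevant settings as plain values and does not call AWS. `RedrivePolicy` (with `deadLetterTargetArn` and `maxReceiveCount`) is the SQS queue attribute that links a queue to its dead-letter queue, and the alarm fields mirror CloudWatch alarm parameters. The class name, method names, map key labels, period and threshold values are assumptions chosen for illustration, not something from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: assembles settings as plain values; nothing here talks to AWS.
final class DlqAlarmSketch {

    // Value for the source queue's "RedrivePolicy" attribute.
    // After maxReceiveCount receives without a delete, SQS moves the
    // message to the queue identified by deadLetterQueueArn.
    static String redrivePolicy(String deadLetterQueueArn, int maxReceiveCount) {
        return "{\"deadLetterTargetArn\":\"" + deadLetterQueueArn + "\","
             + "\"maxReceiveCount\":" + maxReceiveCount + "}";
    }

    // Settings for a CloudWatch alarm that fires as soon as the
    // dead-letter queue holds at least one visible message.
    // The map keys are illustrative labels, not an SDK request type.
    static Map<String, String> dlqAlarmSettings(String deadLetterQueueName) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("Namespace", "AWS/SQS");
        settings.put("MetricName", "ApproximateNumberOfMessagesVisible");
        settings.put("Dimension.QueueName", deadLetterQueueName);
        settings.put("Statistic", "Maximum");
        settings.put("Period", "300");             // seconds (assumed value)
        settings.put("EvaluationPeriods", "1");    // assumed value
        settings.put("Threshold", "0");
        settings.put("ComparisonOperator", "GreaterThanThreshold");
        return settings;
    }
}
```

For example, `DlqAlarmSketch.redrivePolicy(dlqArn, 5)` would produce the JSON to set on the main queue, and `DlqAlarmSketch.dlqAlarmSettings("my-queue-dlq")` (a hypothetical queue name) lists what the alarm on the dead-letter queue would need.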

Krease answered Sep 29 '22 10:09


I think this might have to do with how this metric handles poison-pill messages. After three or more receives, the message is no longer included in the metric. From the AWS docs:

After a message is received three times (or more) and not processed, the message is moved to the back of the queue and the ApproximateAgeOfOldestMessage metric points at the second-oldest message that hasn't been received more than three times. This action occurs even if the queue has a redrive policy.
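To make the quoted behaviour concrete, here is a toy Java model of it. This is not how SQS computes the metric internally; it simply assumes that messages received three or more times are skipped and that the metric reports the age of the oldest remaining message. The quoted text says both "three times (or more)" and "more than three times", so the exact cutoff used here is an assumption, as are the class, field and method names.

```java
import java.util.List;
import java.util.OptionalLong;

// Toy model of the documented behaviour; not SQS's real implementation.
final class OldestMessageAgeModel {

    // Assumed cutoff: messages received this many times or more are ignored.
    static final int POISON_RECEIVE_COUNT = 3;

    static final class Msg {
        final long sentAtSeconds;   // when the message was first sent
        final int receiveCount;     // receives so far without a delete

        Msg(long sentAtSeconds, int receiveCount) {
            this.sentAtSeconds = sentAtSeconds;
            this.receiveCount = receiveCount;
        }
    }

    // Age in seconds of the oldest message that is still counted,
    // or empty if every message has hit the poison cutoff.
    static OptionalLong approximateAgeOfOldest(List<Msg> queue, long nowSeconds) {
        return queue.stream()
                .filter(m -> m.receiveCount < POISON_RECEIVE_COUNT)
                .mapToLong(m -> nowSeconds - m.sentAtSeconds)
                .max();
    }
}
```

Under this model, a queue holding one message sent five days ago with hundreds of receives plus one fresh message sent five minutes ago reports 300 seconds rather than five days, because the stuck message is skipped.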

John K answered Sep 29 '22 11:09