SQS messages never get removed/deleted after script run

I'm having issues where my SQS messages are never deleted from the SQS queue. They are only removed when the message retention period ends, which is 4 days.

So to summarize the app:

  • Send URL to SQS Queue to wait to be crawled

  • Send message to Elastic Beanstalk App that crawls the data and stores it in the database

The script seems to be working in the sense that it does receive the message, crawls it successfully, and stores the data successfully in the database. The only issue is that the messages remain in the queue, stuck at "Message Available".

For example, if I load the queue with 800 messages, it stays at ~800 messages for 4 days, and then they are all deleted at once because the retention period has expired. A few messages do seem to get deleted, because the number changes slightly, but the large majority are never removed from the queue.

So my questions are:

  • Isn't SQS supposed to remove the message as soon as it has been sent to and received by the script?

  • Is there a way to manually delete the current message from within the script itself? As far as I know, the message only travels one way, from SQS -> App, so I cannot do SQS <-> App.

Any ideas?

Marcus Lind asked Jun 17 '14 08:06


People also ask

Does SQS delete message automatically?

Amazon SQS doesn't automatically delete a message after retrieving it for you, in case you don't successfully receive the message (for example, if the consumers fail or you lose connectivity).

What happens if SQS message is not deleted?

Messages in Amazon SQS have a limited amount of time to be received and processed before they are automatically deleted by SQS. If a message is not processed before the message retention period has passed, it will be permanently deleted.

How long do messages stay on SQS?

You can configure the Amazon SQS message retention period to a value from 1 minute to 14 days. The default is 4 days. Once the message retention quota is reached, your messages are automatically deleted.


2 Answers

A web application in a worker environment tier should only listen on the local host. When the web application in the worker environment tier returns a 200 OK response to acknowledge that it has received and successfully processed the request, the daemon sends a DeleteMessage call to the SQS queue so that the message will be deleted from the queue. (SQS automatically deletes messages that have been in a queue for longer than the configured RetentionPeriod.) If the application returns any response other than 200 OK or there is no response within the configured InactivityTimeout period, SQS once again makes the message visible in the queue and available for another attempt at processing.

http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html

So I guess that answers my question. For some messages my app does not return HTTP 200, so they become visible again and are retried in an endless loop until the retention period expires.
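To make the 200 OK rule concrete, here is a minimal sketch of a worker endpoint using only the Python standard library. It assumes the daemon POSTs the message body to the app, as the quoted documentation describes. The names `crawl_and_store`, `status_for` and `serve`, the port, and the choice to acknowledge permanently bad input with a 200 are all assumptions made for illustration, not part of the original app.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def crawl_and_store(url: str) -> None:
    """Hypothetical stand-in for the real crawl-and-save logic."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not a crawlable URL: {url!r}")


def status_for(body: str, process=crawl_and_store) -> int:
    """Map the outcome of one message to the status code the daemon sees."""
    try:
        process(body.strip())
    except ValueError:
        # Input that can never succeed: acknowledge it so the daemon
        # deletes it instead of retrying it until the retention period ends.
        return 200
    except Exception:
        # Possibly transient failure: any non-200 response makes the
        # message visible again for another attempt.
        return 500
    return 200


class WorkerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        self.send_response(status_for(body))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen on the local host only, as the worker tier expects."""
    HTTPServer(("127.0.0.1", port), WorkerHandler).serve_forever()
```

The key point is that every code path ends in an explicit status code, so a message is either acknowledged (and deleted by the daemon) or deliberately left for a retry.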

Marcus Lind answered Nov 15 '22 10:11


No, messages are not deleted when you read a queue item; the item is only hidden for a specific amount of time, called the Visibility Timeout. The idea behind the visibility timeout is to ensure that, if there are multiple consumers for a single queue, no two consumers pick up the same item and start processing it.
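The behaviour described above can be illustrated with a small in-memory model. This is a toy sketch, not the SQS API: `ToyQueue` and its methods are hypothetical, time is passed in explicitly, and the message id doubles as the handle (real SQS issues a separate receipt handle on each receive).

```python
class ToyQueue:
    """In-memory model of visibility timeout; not the real SQS API."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time it becomes visible]
        self._next_id = 0

    def send(self, body):
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now):
        """Return (handle, body) for one visible message, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Receiving only hides the message; it does not delete it.
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, handle):
        self._messages.pop(handle, None)

    def __len__(self):
        return len(self._messages)


q = ToyQueue(visibility_timeout=30)
q.send("http://example.com")
first = q.receive(now=0)    # handed to a consumer and hidden
hidden = q.receive(now=10)  # None: still inside the visibility timeout
again = q.receive(now=30)   # the same message is back, it was never deleted
q.delete(again[0])          # only an explicit delete removes it
```

Without the final `delete`, the message would keep reappearing every 30 time units, which matches the "stuck in the queue" symptom in the question.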

This is the change you need to make to your app to get the expected behavior:

  1. Send the URL to the SQS queue to wait to be crawled
  2. Send the message to the Elastic Beanstalk app, which crawls the data and stores it in the database
  3. When the crawl succeeds, use the receipt handle (not the message ID) to delete the queue item from the queue.

AWS Documentation - DeleteMessage
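The steps above can be sketched roughly as follows, assuming the app polls the queue itself rather than relying on the Elastic Beanstalk worker daemon. The function `drain_once` and its `process` callback are hypothetical names; `sqs` is expected to behave like a boto3 SQS client, whose `receive_message` and `delete_message` calls take the keyword arguments shown.

```python
def drain_once(sqs, queue_url, process):
    """Receive one batch, process each message, delete the ones that succeed.

    `sqs` is assumed to behave like boto3.client("sqs").
    Returns the number of messages deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling
    )
    deleted = 0
    # The "Messages" key is absent when the queue has nothing visible.
    for message in response.get("Messages", []):
        try:
            process(message["Body"])
        except Exception:
            # Leave the message alone; it becomes visible again
            # once the visibility timeout expires.
            continue
        # Deletion needs the receipt handle from this receive call,
        # not the message ID.
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        deleted += 1
    return deleted
```

With boto3 installed, this would be called as `drain_once(boto3.client("sqs"), queue_url, crawl)`, where `queue_url` and `crawl` are supplied by your app.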

Naveen Vijay answered Nov 15 '22 11:11