Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best practice for polling an AWS SQS queue and deleting received messages from queue?

I have an SQS queue that is constantly being populated by a data consumer and I am now trying to create the service that will pull this data from SQS using Python's boto.

The way I designed it is that I will have 10-20 threads all trying to read messages from the SQS queue and then doing what they have to do on the data (business logic), before going back to the queue to get the next batch of data once they're done. If there's no data they will just wait until some data is available.

I have two areas I'm not sure about with this design

  1. Is it a matter of calling receive_message() with a long time_out value and if nothing is returned in the 20 seconds (maximum allowed) then just retry? Or is there a blocking method that returns only once data is available?
  2. I noticed that once I receive a message, it is not deleted from the queue, do I have to receive a message and then send another request after receiving it to delete it from the queue? seems like a little bit of an overkill.

Thanks

like image 929
Mo. Avatar asked Jul 09 '15 15:07

Mo.


People also ask

Do we need to delete message from SQS?

Depending on your application's needs, you might have to use short or long polling to receive messages. Amazon SQS doesn't automatically delete a message after retrieving it for you, in case you don't successfully receive the message (for example, if the consumers fail or you lose connectivity).

Do we need to poll SQS?

Long polling helps reduce the cost of using Amazon SQS by eliminating the number of empty responses (when there are no messages available for a ReceiveMessage request) and false empty responses (when messages are available but aren't included in a response).

How do I clean my SQS queue?

To purge a queue, log in to the AWS Management Console and choose Amazon SQS. Then, select a queue, and choose “Purge Queue” from the Queue Actions menu. The queue will then be cleared of all messages. You can also purge queues using the AWS SDKs or command-line tools.

Which is generally preferred SQS long polling or SQS short polling?

From SQS FAQs: In almost all cases, Amazon SQS long polling is preferable to short polling. Long-polling requests let your queue consumers receive messages as soon as they arrive in your queue while reducing the number of empty ReceiveMessageResponse instances returned.


2 Answers

The long-polling capability of the receive_message() method is the most efficient way to poll SQS. If that returns without any messages, I would recommend a short delay before retrying, especially if you have multiple readers. You may want to even do an incremental delay so that each subsequent empty read waits a bit longer, just so you don't end up getting throttled by AWS.

And yes, you do have to delete the message after you have read or it will reappear in the queue. This can actually be very useful in the case of a worker reading a message and then failing before it can fully process the message. In that case, it would be re-queued and read by another worker. You also want to make sure the invisibility timeout of the messages is set to be long enough the the worker has enough time to process the message before it automatically reappears on the queue. If necessary, your workers can adjust the timeout as they are processing if it is taking longer than expected.

like image 195
garnaat Avatar answered Oct 17 '22 02:10

garnaat


If you want a simple way to set up a listener that includes automatic deletion of messages when they're finished being processed, and automatic pushing of exceptions to a specified queue, you can use the pySqsListener package.

You can set up a listener like this:

from sqs_listener import SqsListener

class MyListener(SqsListener):
    def handle_message(self, body, attributes, messages_attributes):
        run_my_function(body['param1'], body['param2']

listener = MyListener('my-message-queue', 'my-error-queue')
listener.listen()

There is a flag to switch from short polling to long polling - it's all documented in the README file.

Disclaimer: I am the author of said package.

like image 12
ygesher Avatar answered Oct 17 '22 01:10

ygesher