
Why do SQS messages sometimes remain in flight on the queue?

I'm using Amazon SQS queues in a very simple way. Usually, messages are written, become visible immediately, and are read. Occasionally, a message is written and then remains In Flight (Not Visible) on the queue for several minutes; I can see it in the console. The queue's Receive Message Wait Time is 0, and the Default Visibility Timeout is 5 seconds. The message stays that way for several minutes, or until a new message is written that somehow releases it. A delay of a few seconds is OK, but more than 60 seconds is not.

There are 8 reader threads that are always long polling, so it's not that nothing is trying to read it; they are.

Edit: To be clear, none of the consumer reads return any messages at all, and this happens whether or not the console is open. In this scenario only one message is involved, and it just sits in the queue, invisible to the consumers.

Has anyone else seen this behavior, and what can I do to improve it?

Here is the SDK for Java I am using:

<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
  <version>1.5.2</version>
</dependency>     

Here is the code that does the reading (startup config: max=10, maxWait=0):

void read(MessageConsumer consumer) {

  List<Message> messages = read(max, maxWait);

  for (Message message : messages) {
    // Delete only after successful consumption; otherwise the message
    // stays in flight until its visibility timeout expires.
    if (tryConsume(consumer, message)) {
      delete(message.getReceiptHandle());
    }
  }
}

private List<Message> read(int max, int maxWait) {

  AmazonSQS sqs = getClient();
  ReceiveMessageRequest rq = new ReceiveMessageRequest(queueUrl);
  rq.setMaxNumberOfMessages(max);
  // A wait time of 0 means short polling: the call returns immediately.
  rq.setWaitTimeSeconds(maxWait);
  List<Message> messages = sqs.receiveMessage(rq).getMessages();

  if (!messages.isEmpty()) {
    LOG.info("read {} messages from SQS queue", messages.size());
  }

  return messages;
}

The "read .." log line never appears while this is happening. That is what prompts me to check in the console whether the message is there, and it is.

Jerico Sandhorn asked Nov 05 '13 15:11




2 Answers

It sounds like you are misinterpreting what you are seeing.

Messages "in flight" are not pending delivery; they are messages that have already been delivered but not yet acted on further by the consumer.

Messages are considered to be in flight if they have been sent to a client but have not yet been deleted or have not yet reached the end of their visibility window.

— https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-available-cloudwatch-metrics.html

When a consumer receives a message, it must at some point either delete the message or send a request to extend that message's visibility timeout. If it does neither, the message automatically becomes visible again when the timeout expires. The visibility timeout is how long the consumer has to do one of these things.
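The rule above can be sketched with a small in-memory model. This is not the AWS SDK: `VisibilityQueueModel` and all of its methods are hypothetical names invented for illustration, the message id doubles as the receipt handle, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a queue with a visibility timeout.
// It only illustrates the rule; it is not how SQS is implemented.
final class VisibilityQueueModel {
    private final Map<String, Long> invisibleUntilMs = new HashMap<>(); // id -> deadline
    private final long visibilityTimeoutMs;
    private int nextId = 0;

    VisibilityQueueModel(long visibilityTimeoutMs) {
        this.visibilityTimeoutMs = visibilityTimeoutMs;
    }

    // A newly sent message is visible immediately (deadline 0).
    String send() {
        String id = "m" + nextId++;
        invisibleUntilMs.put(id, 0L);
        return id;
    }

    // Returns every visible message and puts each one in flight.
    List<String> receive(long nowMs) {
        List<String> received = new ArrayList<>();
        for (Map.Entry<String, Long> e : invisibleUntilMs.entrySet()) {
            if (nowMs >= e.getValue()) {
                e.setValue(nowMs + visibilityTimeoutMs);
                received.add(e.getKey());
            }
        }
        return received;
    }

    // Option 1: the consumer is done, so the message is removed for good.
    void delete(String id) {
        invisibleUntilMs.remove(id);
    }

    // Option 2: the consumer needs more time, so the deadline restarts from now.
    void changeVisibility(String id, long nowMs, long newTimeoutMs) {
        if (invisibleUntilMs.containsKey(id)) {
            invisibleUntilMs.put(id, nowMs + newTimeoutMs);
        }
    }

    // Counts messages that were received but are not yet visible again.
    int inFlightCount(long nowMs) {
        int count = 0;
        for (long deadline : invisibleUntilMs.values()) {
            if (nowMs < deadline) count++;
        }
        return count;
    }
}
```

With a 5000 ms timeout, a message received at time 0 is hidden from a second `receive` at 1000 ms and is handed out again at 5000 ms unless `delete` or `changeVisibility` was called in between.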

Messages should not be "in flight" unless something has already received them, but that "something" can include the console itself, as noted in the pop-up you see when you choose "View/Delete Messages" in the console (unless you already checked the "Don't show this again" checkbox):

Messages displayed in the console will not be available to other applications until the console stops polling for messages.

Messages displayed in the console are "in flight" while the console is observing the queue from the "View/Delete Messages" screen.

The part that does not make obvious sense is messages being in flight "for several minutes" when your default visibility timeout is only 5 seconds and nothing in your code is increasing it. However, that could be explained almost perfectly by your consumers not properly disposing of the message, so that it times out and is immediately redelivered. This would give the impression that a single instance of the message was remaining in flight, when in fact the message briefly transitions back to visible, only to be claimed almost immediately by another consumer and taken back to in flight.
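To see how a 5-second timeout could look like minutes of in-flight time, here is a rough simulation of the cycle described above. It assumes a single message that is received but never deleted and consumers that poll at a fixed interval; `RedeliverySimulation`, the 100 ms interval, and the 3-minute window are hypothetical choices for illustration, and only the 5-second timeout comes from the question.

```java
// Hypothetical simulation, not AWS code: one message that is received
// repeatedly but never deleted by any consumer.
final class RedeliverySimulation {
    final long receives;    // how many times the message was handed to a consumer
    final long inFlightMs;  // total time the message spent invisible

    private RedeliverySimulation(long receives, long inFlightMs) {
        this.receives = receives;
        this.inFlightMs = inFlightMs;
    }

    static RedeliverySimulation run(long durationMs, long visibilityTimeoutMs, long pollIntervalMs) {
        if (pollIntervalMs <= 0) {
            throw new IllegalArgumentException("pollIntervalMs must be positive");
        }
        long receives = 0;
        long inFlightMs = 0;
        long invisibleUntilMs = 0; // the message starts out visible
        for (long now = 0; now < durationMs; now += pollIntervalMs) {
            if (now >= invisibleUntilMs) {
                // The message is visible, so this poll claims it again.
                receives++;
                invisibleUntilMs = now + visibilityTimeoutMs;
                // Count only the part of the timeout that falls inside the window.
                inFlightMs += Math.min(visibilityTimeoutMs, durationMs - now);
            }
        }
        return new RedeliverySimulation(receives, inFlightMs);
    }
}
```

For example, `RedeliverySimulation.run(180_000, 5_000, 100)` reports 36 receives and 180000 ms in flight: in this model the message is invisible for the entire three minutes even though no single receive holds it for more than 5 seconds.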

Michael - sqlbot answered Oct 22 '22 03:10


This can happen when you send or lock a message and then, within a few seconds, try to fetch a fresh list of messages. Amazon SQS stores queue data on multiple servers in multiple data centers (http://aws.amazon.com/sqs/faqs/#How_reliably_is_my_data_stored_in_Amazon_SQS), so a receive request may not immediately reflect the latest state of the queue.

To avoid this, wait longer before reading so that the queue has more time to return accurate results.
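One way to "wait more" on the client side is to retry an empty read a few times before giving up. The helper below is only a sketch: `RetryingReader` and `readWithRetry` are hypothetical names, and the `Supplier` stands in for whatever call actually reads from the queue. With the AWS SDK itself, passing a nonzero value (up to 20 seconds) to `setWaitTimeSeconds` on the receive request asks SQS to long poll, which may make a client-side loop like this unnecessary.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: retries a read that came back empty.
final class RetryingReader {
    // Calls receive up to maxAttempts times, pausing pauseMs after each empty result.
    static <T> List<T> readWithRetry(Supplier<List<T>> receive, int maxAttempts, long pauseMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<T> batch = receive.get();
            if (!batch.isEmpty()) {
                return batch;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return Collections.emptyList();
    }
}
```

In the question's code, the supplier could wrap the existing private `read(max, maxWait)` method, for example `() -> read(max, maxWait)`.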

Satish Pandey answered Oct 22 '22 03:10