I have multiple sources pushing raw data to S3, and I have configured an SQS event notification on my S3 bucket. The problem is the lag and the limitations.
I expect to add more sources in the near future. Since a single poll returns at most 10 messages from SQS, I am worried that once more sources push data to S3, the queue will fill up with thousands of messages and I won't be able to process them fast enough.
I am thinking of fanning out SQS by spreading the messages from my master SQS queue across more SQS queues, so that my processing layer can poll multiple queues (e.g. 5 queues) and process more messages. What would be a reasonable approach?
"... since we can get only 10 messages in a single poll from SQS ... I am thinking to fan-out SQS like spreading the message to more SQS queues from my master SQS queue, so that my processing layer can poll multiple queues, e.g. 5 queues, and process more messages."
Short Answer: Don't do this.
Here's why:
Yes, a single poll can retrieve at most 10 messages. However, you can have multiple threads and multiple hosts all polling the same queue. Getting your consumers to run in parallel is the key here, because processing the queue entries will be your bottleneck, not retrieving them from the queue. A single SQS queue can handle tons of polling threads.
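As a rough sketch of what "multiple threads polling a single queue" could look like in Python: the helper names (`poll_worker`, `run_consumers`, `make_client`), the worker count, and the `handle` callback are all made up for illustration, and the client is assumed to be a boto3 SQS client (boto3 low-level clients are generally safe to share between threads). Treat it as a starting point, not a tuned implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def make_client():
    """Hypothetical helper: build a real SQS client (requires boto3)."""
    import boto3  # third-party; imported lazily so the rest runs without it
    return boto3.client("sqs")


def poll_worker(sqs, queue_url, handle, stop_event):
    """Long-poll one queue until stop_event is set; return messages handled."""
    handled = 0
    while not stop_event.is_set():
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS per-call maximum
            WaitTimeSeconds=20,      # long polling cuts down on empty responses
        )
        for msg in resp.get("Messages", []):
            handle(msg["Body"])      # your processing step, the likely bottleneck
            # Delete only after successful processing, so that a failure
            # lets the message reappear after the visibility timeout.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )
            handled += 1
    return handled


def run_consumers(sqs, queue_url, handle, stop_event, workers=8):
    """Run `workers` pollers against the SAME queue; return total handled."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(poll_worker, sqs, queue_url, handle, stop_event)
            for _ in range(workers)
        ]
        return sum(f.result() for f in futures)
```

You would call something like `run_consumers(make_client(), queue_url, process_s3_event, threading.Event())`, where `queue_url` and `process_s3_event` are yours to supply. To scale further, raise `workers` or run the same program on more hosts; every poller still points at the one queue.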
A multi-queue fanout as you proposed would have a number of drawbacks: