I'm looking for help with an architectural design decision I'm making with a product.
We've got multiple producers (initiated by API Gateway calls into Lambda) that put messages on a SQS queue (the request queue). There can be multiple simultaneous calls, so there would be multiple Lambda instances running in parallel.
Then we have consumers (lets say twenty EC2 instances) who long-poll on the SQS for the message to process them. They take about 30-45 seconds to process a message each.
I would then ideally like to send the response back to the producer that issued the request - and this is the part I'm struggling with with SQS. I would in theory have a separate response queue that the initial Lambda producers would then be consuming, but there doesn't seem to be a way to cherry pick the specific correlated response. That is, each Lambda function might pick up another function's response. I'm looking for something similar to this design pattern: http://soapatterns.org/design_patterns/asynchronous_queuing
The only option that I can see is to create a new SQS Response queue for each Lambda API call, passing in its ARN in the message for the consumers to put the response on, but I can't imagine that's very efficient - especially when there's potentially hundreds of messages a minute? Am I missing something obvious?
I suppose the only other alternative would be setting up a bigger message broker (e.g. RabbitMQ/ApacheMQ) environment, but I'd like to avoid that if possible.
Thanks!
Amazon SQS is a fully managed message queuing service that makes it easy to decouple and scale microservices, distributed systems, and serverless applications. Asynchronous workflows have always been the primary use case for SQS.
With Amazon SQS, you can offload tasks from one component of your application by sending them to a queue and processing them asynchronously. Lambda polls the queue and invokes your Lambda function synchronously with an event that contains queue messages.
Delay queues let you postpone the delivery of new messages to consumers for a number of seconds, for example, when your consumer application needs additional time to process messages. If you create a delay queue, any messages that you send to the queue remain invisible to consumers for the duration of the delay period.
SQS is mainly used to decouple applications or integrate applications. SNS is used to broadcast messages and it's up to the receivers how they interpret and process those messages.
Create a (Temporary) Response Queue For Every Request
To late for the party, but i was thinking that i might find some help in what i want to achieve, @MattHouser @Zaheer Ally , or give an idea to someone working on a related issue.
I am facing a similar challenge. I have an API that upon request by a client, needs to communicate to multiple external APIs and collect (delayed) results.
Since my PHP API is synchronous, it can only perform these requests sequentially. So, i was thinking to use a request queue, where the producer (API) would send messages. Then, multiple workers would consume these messages, each of them performing one of these external API calls.
To get the results back, the producer would have created a temporary response queue, the name-identifier of which would be embedded in the message sent to workers. Hence, each worker would 'publish' his results on this temporary queue.
In the meantime, the producer would keep polling the temporary queue until he received the expected number of messages. Finally, he would delete the queue and send the collected results back to the client.
Yes, you could use RabbitMQ for a more "rpc" queue pattern.
But if you want to stay within AWS, try using something other than SQS for the response.
Instead, you could use S3 for the response. When your producer puts the item into SQS, include in the message an S3 destination for the response. When your consumer completes the tasks, put the response in the desired S3 location.
Then you can check S3 for the response.
Update
You may be able to accomplish an RPC-like message queue using Redis.
https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis
Then, you can use AWS ElastiCache for your Redis cluster. This would completely replace the use of SQS.
Another option would be to use Redis' pub/sub mechanism to asynchronously notify your lambda that the backend work is done. You can use AWS's Elasticache for Redis for an all-AWS-managed solution. Your lambda function would generate a UUID for each request, use that as the channel name to subscribe to, pass it along in the SQS message, and then the backend workers would publish a notification to that channel when the work is done.
I was facing this same problem so I tried it out, and it does work. Whether it's worth the effort over just polling S3 is another question. You have to configure the lambda functions to run inside your VPC, so they can access your Redis. I was going to have to do this anyway since I'd want the workers, in my case also lambda functions, to be able to access my Elasticsearch and RDS. But there are some considerations: most importantly, you need to use a private subnet with a NAT Gateway (or your own NAT Instance), so it can get out to the Internet and AWS managed services (including SQS).
One other thing I just stumbled across is that requests through API Gateway currently cannot take longer than 29 seconds, and this cannot be increased by AWS. You mentioned your jobs take 30 or more seconds, so this could be a showstopper for you using API Gateway and Lambda in this way anyway.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With