How does AWS Lambda serve multiple requests? Is it a multi-threaded model here as well?
Suppose I am calling a Lambda function from API Gateway and 1000 requests arrive at the API in 10 seconds. How many containers will be created, and how many threads?
Even with an increased concurrent execution limit, there is one more limit: the burst concurrency limit. It restricts Lambda to an initial burst of 3000 concurrent requests (the exact figure depends on the region). If the function receives more than 3000 concurrent requests, some of them will be throttled until Lambda scales up, which it does at a rate of 500 additional instances per minute.
You can code your Lambda function to process each record of a batch in a loop and perform some action for each one. After all the records are processed, the function may return a response payload. This could be the string "Success", another message, some data, or None.
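As a rough illustration of the loop described above, here is a minimal sketch of a Python handler. It assumes an SQS-style event with a "Records" list whose items carry a "body" field; process_record is a hypothetical placeholder for whatever per-record action you need.

```python
def process_record(record):
    # Hypothetical per-record action: here we only upper-case the body.
    # Replace this with the real work (write to a database, call an API, ...).
    return record.get("body", "").upper()


def lambda_handler(event, context):
    # Assumption: the event follows the SQS-style batch shape {"Records": [...]}.
    records = event.get("Records", [])

    # Process each record of the batch in turn.
    for record in records:
        process_record(record)

    # Return a response payload once every record has been handled.
    return "Success"
```

Calling lambda_handler({"Records": [{"body": "a"}, {"body": "b"}]}, None) locally runs the loop over both records and returns the payload.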
Concurrency is the number of requests that your function is serving at any given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function code finishes running, that instance can handle another request.
The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month.
How does AWS Lambda serve multiple requests?
Independently.
Is it a multi-threaded model here as well?
No, it is not a multi-threaded model in the sense that you are asking.
Your code can, of course, be written to use multiple threads and/or child processes to accomplish whatever purpose it is intended to accomplish for one invocation, but Lambda doesn't send more than one invocation at a time to the same container. The container is not used for a second invocation until the first one finishes. If a second request arrives while a first one is running, the second one will run in a different container.
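To make the distinction concrete, here is a minimal sketch of a handler that uses several threads inside a single invocation. The "items" event key and the fetch_length helper are hypothetical, chosen only for illustration. The threads all serve the one request being handled; they do not let the container accept a second invocation in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_length(item):
    # Hypothetical unit of work; in practice this might be a network call.
    return len(item)


def lambda_handler(event, context):
    # Assumption: the event carries a list of strings under "items".
    items = event.get("items", [])

    # Fan the work for THIS invocation out over a small thread pool.
    # pool.map returns the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(fetch_length, items))

    return {"results": results}
```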
Suppose I am calling a Lambda function from API Gateway and 1000 requests arrive at the API in 10 seconds. How many containers will be created, and how many threads?
As many containers will be created as are needed to process each of the arriving requests in its own container.
The duration of each invocation will be the largest determinant of this.
1000 very quick requests in 10 seconds are roughly equivalent to 100 requests in 1 second. Assuming each request finishes in less than 1 second and arrival times are evenly distributed, you could expect fewer than 100 containers to be created.
On the other hand, if 1000 requests arrived in 10 seconds and each request took 30 seconds to complete, you would have 1000 containers in existence during this event.
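The two scenarios above can be approximated with a back-of-the-envelope calculation: concurrent containers are roughly the arrival rate multiplied by the invocation duration, capped at the total number of requests. The sketch below assumes evenly distributed arrivals and ignores cold-start time and throttling; estimate_containers is a hypothetical helper, not part of any AWS API.

```python
import math


def estimate_containers(total_requests, window_seconds, duration_seconds):
    """Rough estimate of peak concurrent containers.

    Assumes requests arrive evenly over the window and that each
    invocation takes the same amount of time.
    """
    if total_requests <= 0:
        return 0

    # Requests arriving per second.
    arrival_rate = total_requests / window_seconds

    # Requests in flight at once: rate multiplied by duration.
    in_flight = math.ceil(arrival_rate * duration_seconds)

    # There can never be more containers than requests, and at least one is needed.
    return max(1, min(total_requests, in_flight))


# 1000 requests in 10 seconds, each taking half a second.
quick = estimate_containers(1000, 10, 0.5)

# 1000 requests in 10 seconds, each taking 30 seconds: every request overlaps.
slow = estimate_containers(1000, 10, 30)
```

With these inputs the quick case comes out at 50 containers and the slow case at 1000, which matches the reasoning in the answer.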
After a spike in traffic inflates the number of containers, they will all tend to linger for a few minutes, ready to handle the additional load if it arrives, and then Lambda will start terminating them.