I have a Lambda function that will be called infrequently in production, but it will be public-facing, so I want to avoid cold starts. I thought I could use provisioned concurrency to avoid this issue. My CloudFormation template looks as follows:
QuoteLinkServiceFunction:
  Type: AWS::Serverless::Function
  Properties:
    # other lambda properties...
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 1
When I create this stack in my Test environment though (where I am the only user, and so there are no other calls happening concurrently), I still experience cold starts when returning to use this function after a few hours. Subsequent calls immediately after the first call run faster as the lambda is now warmed up.
The lambda console shows that the alias for this function has actually been set up with a provisioned concurrency of 1, and I have verified the ALB target group is pointed at the alias. So why am I still getting cold starts?
tl;dr:
We're experiencing the same issue, and haven't found any way around it. In the end, Lambda instances are transient, so there's no guaranteed continuous uptime (even with provisioned concurrency).
What provisioned concurrency does give you though is the guarantee of a number of running instances - although these can be swapped with other instances at any point in time (and incur a cold start when that happens). The frequency of the swaps seems pretty arbitrary, and I assume, completely up to AWS.
EDIT: We eventually realized this isn't an issue at all! It just has to do with the nature of how provisioned concurrency functions:
With provisioned concurrency, initialization/cold starts still happen, but they happen before the Lambda is made available to be invoked.
However, clients could still experience another form of cold start if the Lambda function doesn't make good use of static initialization, which is often slower than the Lambda initialization itself:
In our analyses of Lambda performance across production invocations, data shows that the largest contributor of latency before function execution comes from INIT code.
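To illustrate what "making good use of static initialization" can look like, here is a minimal Python sketch. It assumes a Python runtime (the question doesn't say which runtime is used), and `build_expensive_client` and `INIT_COUNT` are hypothetical names made up for the example. The idea is that slow setup lives at module level, so it runs during INIT rather than inside the first request:

```python
import time

# Counts how many times the slow setup ran (for illustration only).
INIT_COUNT = 0


def build_expensive_client():
    """Hypothetical stand-in for slow setup, e.g. loading config or opening connections."""
    global INIT_COUNT
    INIT_COUNT += 1
    time.sleep(0.05)  # simulate slow setup work
    return {"ready": True}


# Static initialization: module-level code runs once per execution environment,
# during INIT. With provisioned concurrency, that is before any request is served.
CLIENT = build_expensive_client()


def handler(event, context):
    # The handler only reuses what INIT already prepared, so no request pays for setup.
    return {"statusCode": 200, "body": str(CLIENT["ready"])}
```

If the same setup were done lazily inside `handler`, the first request to reach each environment would still pay for it, even with provisioned concurrency configured.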
All that is summed up pretty well by the images below (from the Lambda Performance Optimization Guide).
A good way to tell if Lambda is indeed doing a cold start is to look at the logs in CloudWatch. Each request should have a REPORT log that looks like this:
REPORT RequestId: f840a316-cf35-42ec-8f4d-c03a6cde9192 Duration: 368.80 ms Billed Duration: 369 ms Memory Size: 128 MB Max Memory Used: 93 MB Init Duration: 3569.10 ms
If you see Init Duration at the end of the log, then it is indeed a cold start. However, with provisioned concurrency, this init duration happened before the Lambda was invoked.
Also, a new CloudWatch log stream seems to be created each time AWS spins up a new Lambda "instance" - which incurs a cold start, confirmed by the fact that the first request of each log stream has an Init Duration. So just taking a look at the "First event time" column will show you all your cold starts (the column can be added via the preferences/gear icon).
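If you have exported the log lines, you can also check for Init Duration with a small script instead of scanning by eye. This is only a sketch: it assumes the REPORT line format shown above, and `init_duration_ms` and `WARM_LINE` are names made up for the example (the warm line is the sample with its Init Duration field removed).

```python
import re

# Matches the optional "Init Duration: <number> ms" field of a REPORT line.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")

# The REPORT line quoted above.
COLD_LINE = (
    "REPORT RequestId: f840a316-cf35-42ec-8f4d-c03a6cde9192 "
    "Duration: 368.80 ms Billed Duration: 369 ms Memory Size: 128 MB "
    "Max Memory Used: 93 MB Init Duration: 3569.10 ms"
)

# Illustrative warm invocation: same line without the Init Duration field.
WARM_LINE = COLD_LINE.replace(" Init Duration: 3569.10 ms", "")


def init_duration_ms(report_line):
    """Return the Init Duration in milliseconds, or None if the line has none."""
    match = INIT_RE.search(report_line)
    return float(match.group(1)) if match else None


def lines_with_init(lines):
    """Keep only the REPORT lines that carry an Init Duration field."""
    return [
        line for line in lines
        if line.startswith("REPORT") and init_duration_ms(line) is not None
    ]
```

Running `lines_with_init` over the lines of a log stream should leave only the requests that landed on a freshly initialized environment.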
This is further confirmed in the Lambda Performance Optimization Guide:
Initialization code is run more frequently than the total number of invocations. Since Lambda is highly available, for every one unit of Provisioned Concurrency, there are a minimum of two execution environments prepared in separate Availability Zones. This is to ensure that your code is available in the event of a service disruption. As environments are reaped and load balancing occurs, Lambda over-provisions environments to ensure availability. You are not charged for this activity. If your code initializer implements logging, you will see additional log files anytime that this code is run, even though the main handler is not invoked.
It can also be a good idea to look at the START log, to make sure that the intended version is being called (the one with provisioned concurrency configured):
START RequestId: f840a316-cf35-42ec-8f4d-c03a6cde9192 Version: 15
It's especially important to make sure that the version is not $LATEST (which cannot benefit from provisioned concurrency):
Each version of a function can only have one provisioned concurrency configuration. This can be directly on the version itself, or on an alias that points to the version. Two aliases can't allocate provisioned concurrency for the same version. Also, you can't allocate provisioned concurrency on an alias that points to the unpublished version ($LATEST).
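The version check can be scripted the same way. Again a sketch only, assuming the START line format shown above; `invoked_version` and `is_published_version` are hypothetical helper names:

```python
import re

# Captures the request id and the version from a START log line.
START_RE = re.compile(r"^START RequestId: (\S+) Version: (\S+)")


def invoked_version(start_line):
    """Return the version string from a START line, or None if it doesn't match."""
    match = START_RE.match(start_line)
    return match.group(2) if match else None


def is_published_version(start_line):
    """True when the request ran on a published version rather than $LATEST."""
    version = invoked_version(start_line)
    return version is not None and version != "$LATEST"
```

A START line reporting `$LATEST` means the request did not go through the alias or version that has provisioned concurrency, so it is worth re-checking what the caller (here, the ALB target group) actually points at.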