I'm using Serverless to deploy an application in which a Custom Resource migrates an RDS database.
Everything works on deploy, but when I delete the stack the Custom Resource times out after an hour with the message "Custom Resource failed to stabilize in expected time.". The request to the pre-signed AWS S3 URL returns 403 with the error code AccessDenied.
The first response body I sent to the pre-signed URL (on Create), which succeeded:
{
  "Status": "SUCCESS",
  "RequestId": "bd487606-8017-49f2-99af-b29b2bbad40b",
  "LogicalResourceId": "SheltersDBMigrationTrigger",
  "StackId": "arn:aws:cloudformation:us-east-1:848139458219:stack/update-shelters-dev/c08a80e0-2e4e-11e9-87a6-124d1eab42ba",
  "PhysicalResourceId": "DB_MIGRATION"
}
The second response body I sent to the pre-signed URL (on Delete), which failed:
{
  "Status": "SUCCESS",
  "RequestId": "2d166d36-7c0c-4848-9eb5-aedaf5e9172c",
  "LogicalResourceId": "SheltersDBMigrationTrigger",
  "StackId": "arn:aws:cloudformation:us-east-1:848139458219:stack/update-shelters-dev/c08a80e0-2e4e-11e9-87a6-124d1eab42ba",
  "PhysicalResourceId": "DB_MIGRATION"
}
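For context, here is a minimal sketch of what delivering such a response involves: the JSON body is sent with an HTTP PUT to the pre-signed URL, and anything other than a 200 means CloudFormation never receives the result. In the real function this step is handled by cfn.LambdaWrap, so the names cfnResponse, sendResponse and checkStatus are hypothetical, and the test server below merely stands in for S3 answering 403.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// cfnResponse mirrors the JSON fields of the response bodies shown above.
// The type and function names are hypothetical, for illustration only.
type cfnResponse struct {
	Status             string `json:"Status"`
	RequestID          string `json:"RequestId"`
	LogicalResourceID  string `json:"LogicalResourceId"`
	StackID            string `json:"StackId"`
	PhysicalResourceID string `json:"PhysicalResourceId"`
}

// checkStatus turns a non-200 reply from the pre-signed URL into an error.
func checkStatus(code int) error {
	if code != http.StatusOK {
		return fmt.Errorf("response upload rejected: HTTP %d", code)
	}
	return nil
}

// sendResponse PUTs the JSON body to the pre-signed URL.
func sendResponse(client *http.Client, url string, r cfnResponse) error {
	body, err := json.Marshal(r)
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return checkStatus(resp.StatusCode)
}

func main() {
	// Stand-in for S3 that always answers 403, like the Delete request did.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusForbidden)
	}))
	defer srv.Close()

	// A client timeout keeps the call from hanging if the URL is unreachable.
	client := &http.Client{Timeout: 5 * time.Second}
	err := sendResponse(client, srv.URL, cfnResponse{
		Status:             "SUCCESS",
		PhysicalResourceID: "DB_MIGRATION",
	})
	fmt.Println(err)
}
```

Logging the returned error inside the Lambda makes a failed upload visible in the function's logs instead of only showing up as a stack timeout an hour later.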
lambda.go:
func handler(ctx context.Context, event cfn.Event) (rid string, data map[string]interface{}, err error) {
	rid = "DB_MIGRATION"
	if event.RequestType != cfn.RequestCreate {
		return
	}
	db, err := sql.Open("mysql", fmt.Sprintf("%s:%s@(%s)/", os.Getenv("DB_MASTER_USER"), os.Getenv("DB_MASTER_PASSWORD"), os.Getenv("DB_ADDRESS")))
	if err != nil {
		panic(err)
	}
	defer db.Close()
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("handler: Failed to migrate DB: %v", r)
		}
	}()
	MigrateDb(db)
	return
}

func main() {
	lambda.Start(cfn.LambdaWrap(handler))
}
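A side note on the handler, unrelated to the 403: the deferred recover is registered after the sql.Open check, so the panic(err) on that path would not be converted into an error. The sketch below shows one possible ordering, assuming setup errors can simply be returned. runMigration and its open and migrate parameters are hypothetical stand-ins so that the example runs without a database.

```go
package main

import (
	"errors"
	"fmt"
)

// runMigration is a hypothetical stand-in for the handler body above.
// open and migrate are injected so the sketch runs without a database.
func runMigration(open func() error, migrate func()) (err error) {
	// Register the recover first so it covers every panic below.
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("handler: failed to migrate DB: %v", r)
		}
	}()

	// Return setup errors directly instead of panicking.
	if openErr := open(); openErr != nil {
		return fmt.Errorf("handler: failed to open DB: %w", openErr)
	}

	migrate()
	return nil
}

func main() {
	// A panic during migration is converted into a returned error.
	err := runMigration(
		func() error { return nil },
		func() { panic("table already exists") },
	)
	fmt.Println(err)

	// A setup failure is returned without panicking at all.
	err = runMigration(
		func() error { return errors.New("unknown driver") },
		func() {},
	)
	fmt.Println(err)
}
```

Either way the wrapper receives a non-nil error and can report FAILED to CloudFormation rather than letting the invocation crash.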
serverless config for Lambda CFN:
functions:
  dbMigration:
    handler: lambda-bin/migrate-db
    environment:
      DB_MASTER_USER: ${env:DB_MASTER_USER}
      DB_MASTER_PASSWORD: ${env:DB_MASTER_PASSWORD}
      DB_ADDRESS:
        "Fn::GetAtt": [ SheltersDB, Endpoint.Address ]
    vpc:
      securityGroupIds:
        - Ref: SheltersVPCSecurityGroup
      subnetIds:
        - Ref: SheltersSubnet1
        - Ref: SheltersSubnet2
...
Resources:
  SheltersDBMigrationTrigger:
    Type: Custom::DBMigration
    DependsOn:
      - SheltersDB
    Properties:
      ServiceToken: !GetAtt
        - DbMigrationLambdaFunction
        - Arn
  SheltersSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [ 0, {Fn::GetAZs: ""} ]
      CidrBlock: 10.0.1.0/24
      VpcId: !Ref SheltersVPC
  SheltersSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [ 1, {Fn::GetAZs: ""} ]
      CidrBlock: 10.0.2.0/24
      VpcId: !Ref SheltersVPC
  SheltersVPCSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Security group for DB connections"
      VpcId: !Ref SheltersVPC
  SheltersVPCSecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref SheltersVPCSecurityGroup
      IpProtocol: tcp
      FromPort: "3306"
      ToPort: "3306"
      SourceSecurityGroupId: !Ref SheltersVPCSecurityGroup
  SheltersVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  SheltersRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref SheltersVPC
  SheltersSubnet1Association:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref SheltersSubnet1
      RouteTableId: !Ref SheltersRouteTable
  SheltersSubnet2Association:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref SheltersSubnet2
      RouteTableId: !Ref SheltersRouteTable
  SheltersVPCS3Endpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcId: !Ref SheltersVPC
      PolicyDocument: "{\"Version\":\"2008-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":\"*\",\"Action\":\"*\",\"Resource\":\"*\"}]}"
      RouteTableIds:
        - !Ref SheltersRouteTable
      ServiceName: !Join ['', ['com.amazonaws.', !Ref 'AWS::Region', '.s3']]
Here's a gist with my full source files and log.
Update with identified problem
It seems that my VPC endpoint to S3, SheltersVPCS3Endpoint, is deleted before dbMigration, and that's why I receive the 403.
With pure CloudFormation I guess this could have been solved easily by putting a DependsOn on dbMigration, but with Serverless that doesn't seem to be possible.
After a long investigation together with AWS Support, we found that SheltersVPCS3Endpoint was deleted before dbMigration. The Lambda function therefore could no longer reach the S3 bucket to send its response, which caused the timeout.
Since, as far as I could tell, it's not possible to add a DependsOn to functions in Serverless, I had to migrate from Serverless to CloudFormation. After I added the following, the problem appears to be solved:
DbMigrationLambdaFunction:
  DependsOn:
    - SheltersVPCS3Endpoint
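To illustrate why the extra DependsOn helps: CloudFormation deletes resources in the reverse of their dependency order, so once the function depends on the endpoint, the endpoint is only removed after the function. The sketch below is a simplified model of that ordering, not CloudFormation's actual scheduler. The deletionOrder helper is hypothetical, it assumes the graph has no cycles, and only the three relevant resources are modeled (the trigger's DependsOn on SheltersDB is left out).

```go
package main

import (
	"fmt"
	"sort"
)

// deletionOrder returns resource names in an order where every resource is
// deleted before the resources it depends on (the reverse of creation order).
// deps maps a resource to the resources it depends on.
func deletionOrder(deps map[string][]string) []string {
	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names) // iterate in a fixed order for deterministic output

	visited := make(map[string]bool)
	var created []string
	var visit func(string)
	visit = func(name string) {
		if visited[name] {
			return
		}
		visited[name] = true
		for _, d := range deps[name] {
			visit(d)
		}
		created = append(created, name) // dependencies are created first
	}
	for _, name := range names {
		visit(name)
	}

	// Deletion runs in the reverse of creation order.
	for i, j := 0, len(created)-1; i < j; i, j = i+1, j-1 {
		created[i], created[j] = created[j], created[i]
	}
	return created
}

func main() {
	deps := map[string][]string{
		// The custom resource references the function through ServiceToken.
		"SheltersDBMigrationTrigger": {"DbMigrationLambdaFunction"},
		// The DependsOn added in the fix above.
		"DbMigrationLambdaFunction": {"SheltersVPCS3Endpoint"},
		"SheltersVPCS3Endpoint":     {},
	}
	fmt.Println(deletionOrder(deps))
}
```

With the DependsOn edge in place, the model deletes the trigger first, then the function, and the endpoint last, so the Delete response can still reach S3. Without that edge nothing orders the endpoint relative to the function, which matches the failure described above.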