
AWS on Terraform: Error deleting resource: timeout while waiting for state to become 'destroyed'

I'm using Terraform (v0.12.28) to launch my AWS environment (aws provider v2.70.0).
When I try to remove all resources with terraform destroy, I get the error below:

error deleting subnet (subnet-XXX): timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 20m0s)

I can add my Terraform code but I think there is nothing special in my resources stack which basically includes:

  1. VPC and Subnets.
  2. Internet and NAT gateways.
  3. Application Load Balancers.
  4. Route tables.
  5. Auto-generated NACL and Elastic Network Interfaces (ENIs).

In my case, the problem seems to be related to the ENIs which are attached to the ALBs - as can be seen from the AWS console:

[Image: AWS console screenshot of the ENIs attached to the ALBs]
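
The same check can be done without the console by listing the subnet's network interfaces through the EC2 API. Below is a minimal Python sketch, assuming boto3 is installed and credentials are configured. The helper names `fetch_enis`, `summarize_enis` and `blocking_enis` are hypothetical (they are not part of Terraform or boto3), and pagination is ignored for brevity.

```python
from typing import Any, Dict, List


def fetch_enis(subnet_id: str) -> Dict[str, Any]:
    """Fetch the ENIs of one subnet (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers below work without it

    ec2 = boto3.client("ec2")
    return ec2.describe_network_interfaces(
        Filters=[{"Name": "subnet-id", "Values": [subnet_id]}]
    )


def summarize_enis(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Reduce a describe_network_interfaces response to the fields of interest."""
    rows = []
    for eni in response.get("NetworkInterfaces", []):
        attachment = eni.get("Attachment") or {}
        rows.append(
            {
                "id": eni["NetworkInterfaceId"],
                "status": eni.get("Status", "unknown"),
                "description": eni.get("Description", ""),
                "requester_managed": bool(eni.get("RequesterManaged", False)),
                "attached": attachment.get("Status") == "attached",
            }
        )
    return rows


def blocking_enis(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return only the ENIs that are still in use and so block subnet deletion."""
    return [row for row in summarize_enis(response) if row["status"] == "in-use"]


# Example usage (subnet ID is a placeholder):
# for row in blocking_enis(fetch_enis("subnet-XXX")):
#     print(row["id"], row["description"], row["requester_managed"])
```

ENIs created on behalf of a load balancer are typically requester-managed and carry a description that names the load balancer, so the description column usually points at the resource that has to be deleted first.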

While searching for solutions, I noticed that this is a common problem that can occur with different resources and types of dependencies.

In this question I'll focus on problems related to VPC components (subnets, ENIs, etc.) and the resources that depend on them (load balancers, EC2 instances, Lambda functions, etc.), which fail to be deleted, probably because a detach phase is required before deletion.

Any help will be highly appreciated.


(*) The Terraform user for this environment (DEV) has full Admin privileges:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "*",
            "Resource": "*"
        }
    ]
}

So this shouldn't be related to policies.


Examples for related issues:

Update: Issue affecting HashiCorp Terraform resource deletions after the VPC Improvements to AWS Lambda (the solution doesn't work for me; I'm already on an updated version of the AWS provider).

AWS VPC - cannot detach "in use" AWS Lambda VPC ENI

Lambda Associated EC2 Subnet and Security Group Deletion Issues and Improvements

AWS: deletion of subnet times out because of scaling group

Error waiting for route table (rtb-xxxxxx) to become destroyed: timeout while waiting for state to become

Error waiting for internet gateway to detach / Cluster has node groups attached

RtmY asked Jul 28 '20 07:07


1 Answer

I ran into this issue while trying to destroy an EKS cluster after I had already deployed services onto the cluster, specifically a load balancer. To fix this, I manually deleted the load balancer and the security group associated with it.

Terraform is not aware of the resources provisioned by k8s and will not clean up dependent resources.

If you're unsure which resources are preventing Terraform from destroying the infrastructure, you can try any of the following:

  • Use terraform apply to get back into a good state and then use kubectl to clean up resources before running terraform destroy again.
  • This knowledge base article includes a script you can run to identify dependencies: https://aws.amazon.com/premiumsupport/knowledge-center/troubleshoot-dependency-error-delete-vpc/
  • Review CloudTrail logs to see what resources were created. If this was an issue with EKS you can filter by username: AmazonEKS.
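
For the CloudTrail option, the lookup can be scripted. Below is a rough Python sketch, assuming boto3 is installed and credentials are configured. `fetch_events` and `created_resources` are hypothetical helper names, and treating event names that start with `Create` or `Run` as "resource creation" is my own heuristic, not something CloudTrail defines.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def fetch_events(username: str = "AmazonEKS") -> List[Dict[str, Any]]:
    """Look up CloudTrail events for one username (needs boto3 and credentials)."""
    import boto3  # imported lazily so created_resources works without it

    cloudtrail = boto3.client("cloudtrail")
    paginator = cloudtrail.get_paginator("lookup_events")
    events: List[Dict[str, Any]] = []
    for page in paginator.paginate(
        LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": username}]
    ):
        events.extend(page.get("Events", []))
    return events


def created_resources(
    events: Iterable[Dict[str, Any]],
    prefixes: Tuple[str, ...] = ("Create", "Run"),  # assumption: creation-like calls
) -> Dict[str, List[str]]:
    """Group resource names by the creation-like event that mentioned them."""
    found = defaultdict(set)
    for event in events:
        name = event.get("EventName", "")
        if not name.startswith(prefixes):
            continue
        for resource in event.get("Resources", []):
            resource_name = resource.get("ResourceName")
            if resource_name:
                found[name].add(resource_name)
    return {name: sorted(names) for name, names in found.items()}


# Example usage:
# for event_name, names in created_resources(fetch_events()).items():
#     print(event_name, names)
```

The output is only a starting list of candidates; anything it reports should still be checked against what actually exists in the account before deleting it by hand.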

Another variation of this issue is a DependencyViolation error. For example:

Error deleting VPC: DependencyViolation: The vpc 'vpc-xxxxx' has dependencies and cannot be deleted. status code: 400

Mathew Tinsley answered Oct 20 '22 15:10