Nodes don't attach to the EKS cluster via Terraform

I'm using Terraform 0.14.2. I'm trying to deploy an EKS cluster with two nodes using this code:

resource "aws_eks_cluster" "cluster" {
  enabled_cluster_log_types = []
  name                      = var.cluster_name
  role_arn                  = aws_iam_role.cluster.arn
  version                   = var.eks_version
  vpc_config {
    subnet_ids              = flatten([ aws_subnet.private.*.id, aws_subnet.public.*.id ])
    security_group_ids      = []
    endpoint_private_access = "true"
    endpoint_public_access  = "true"
  }
  tags = var.tags[terraform.workspace]

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy,
    aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy,
    aws_cloudwatch_log_group.cluster
  ]
}

resource "aws_launch_configuration" "eks-managenodes" {
  for_each                    = local.ob
  
  name_prefix                 = "${var.cluster_name}-launch-${each.value}"
  image_id                    = "ami-038341f2c72928ada"
  instance_type               = "t3.medium"
  user_data = <<-EOF
      #!/bin/bash
      set -o xtrace
      /etc/eks/bootstrap.sh ${var.cluster_name}
      EOF

  root_block_device {
    delete_on_termination = true
    volume_size = 30
    volume_type = "gp2"
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "eks-asg" {
  for_each        = local.ob

  desired_capacity     = 1
  launch_configuration = aws_launch_configuration.eks-managenodes[each.value].id
  max_size             = 1
  min_size             = 1
  name                 = "${var.cluster_name}-node-${each.value}"
  vpc_zone_identifier  = aws_subnet.private.*.id

  tag {
    key                 = "Name"
    value               = "eks-manage-node-${each.value}"
    propagate_at_launch = true
  }

  tag {
    key                 = "kubernetes.io/cluster/${var.cluster_name}"
    value               = "owned"
    propagate_at_launch = true
  }
  depends_on = [
    aws_launch_configuration.eks-managenodes,
    aws_eks_cluster.cluster
  ]
}

The cluster deploys fine, and the ASG and EC2 instances deploy fine, but the instances don't attach to the corresponding cluster, and I can't find the problem.

Any idea? Thanks

asked Dec 05 '25 by Humberto Lantero


1 Answer

Nodes can fail to join a cluster for a variety of reasons.

  1. A failure during cloud-init may be preventing them from registering with the cluster control plane.
  2. There may be IAM authentication failures.

Debugging steps (a shell sketch of these checks follows the list):

  1. SSH into a node and check /var/log/cloud-init.log and /var/log/cloud-init-output.log to ensure that cloud-init completed without error.

  2. Verify that the kubelet and aws-node processes are running on the EC2 nodes. Both should show up in ps.

  3. Check that /etc/eks/bootstrap.sh exists. Try invoking it as root with the arguments /etc/eks/bootstrap.sh --apiserver-endpoint '${endpoint}' --b64-cluster-ca '${cluster_ca_data}' '${cluster_name}', using the values from the EKS overview page in the AWS console.

  4. Check the aws-auth ConfigMap in kube-system and verify the EC2 node role is mapped like this:

    mapRoles: |
      - rolearn: arn:aws:iam::<account id>:role/<node role>
        username: system:node:{{EC2PrivateDNSName}}
        groups:
          - system:bootstrappers
          - system:nodes

Without this mapping, the node will not be able to authenticate to the cluster.
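
As a rough companion to the numbered checks above, here is a shell sketch of steps 1-4. It assumes SSH access to a worker node, kubectl access to the cluster, and the AWS CLI configured for the right account and region; my-cluster, <node-private-ip>, <endpoint>, and <ca-data> are placeholders.

    # On your workstation: fetch the API endpoint and cluster CA that
    # bootstrap.sh needs for step 3.
    aws eks describe-cluster --name my-cluster \
      --query 'cluster.endpoint' --output text
    aws eks describe-cluster --name my-cluster \
      --query 'cluster.certificateAuthority.data' --output text

    # On a worker node (e.g. ssh ec2-user@<node-private-ip>):

    # Step 1: look for errors in the cloud-init logs.
    sudo grep -iE 'error|fail' /var/log/cloud-init.log /var/log/cloud-init-output.log

    # Step 2: kubelet and aws-node should both show up in ps.
    ps aux | grep -E 'kubelet|aws-node' | grep -v grep
    sudo systemctl status kubelet

    # Step 3: re-run the bootstrap script by hand, pasting in the endpoint
    # and CA data printed by the describe-cluster calls above.
    sudo /etc/eks/bootstrap.sh \
      --apiserver-endpoint '<endpoint>' \
      --b64-cluster-ca '<ca-data>' \
      my-cluster

    # Back on your workstation, step 4: inspect the aws-auth ConfigMap and
    # compare it against the role mapping shown above.
    kubectl -n kube-system get configmap aws-auth -o yaml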

When in doubt, try the newest version of the EKS AMI for your cluster's Kubernetes version; some AMIs are broken.
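
On that note, the current recommended EKS-optimized AMI for a given Kubernetes version is published as a public SSM parameter, so you can look it up (or feed it into Terraform via a data source) instead of hard-coding an AMI ID. A minimal lookup, assuming the Amazon Linux 2 variant and using 1.18 as a stand-in for your cluster's version:

    # Current recommended EKS-optimized Amazon Linux 2 AMI for Kubernetes 1.18
    # in the configured region (replace 1.18 with your cluster's version).
    aws ssm get-parameter \
      --name /aws/service/eks/optimized-ami/1.18/amazon-linux-2/recommended/image_id \
      --query 'Parameter.Value' --output text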

answered Dec 07 '25 by mcfinnigan


