I'm using Terraform to create multiple EC2 nodes on AWS:
resource "aws_instance" "myapp" {
  count                  = "${var.count}"
  ami                    = "${data.aws_ami.ubuntu.id}"
  instance_type          = "m4.large"
  vpc_security_group_ids = ["${aws_security_group.myapp-security-group.id}"]
  subnet_id              = "${var.subnet_id}"
  key_name               = "${var.key_name}"
  iam_instance_profile   = "${aws_iam_instance_profile.myapp_instance_profile.id}"

  connection {
    user        = "ubuntu"
    private_key = "${file("${var.key_file_path}")}"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get upgrade -y",
      "sudo apt-get install -f -y openjdk-7-jre-headless git awscli"
    ]
  }
}
When I run this with, say, count=4, some nodes intermittently fail with apt-get errors like:
aws_instance.myapp.1 (remote-exec): E: Unable to locate package awscli
while the other 3 nodes found awscli just fine. All nodes are created from the same AMI and use the exact same provisioning commands, so why would only some of them fail? The variation could potentially come from:
Which is more likely? Any other possibilities I'm missing?
Is there an apt-get "force" type flag I can use that will make the provisioning more repeatable?
The whole point of automating provisioning through scripts is to avoid this kind of variation between nodes :/
The remote-exec provisioner in Terraform just generates a shell script from the commands you specify, uploads it to the new instance, and runs it. Most likely you're running into a conflict with cloud-init, which is configured to run at boot on the standard Ubuntu AMIs. The provisioner starts as soon as it can connect over SSH, and at that point cloud-init may still be running and setting up apt, so whether apt-get sees a complete package list comes down to timing.
You can make your script wait until cloud-init has finished. cloud-init creates the file /var/lib/cloud/instance/boot-finished when it completes, so you can put this inline at the start of your provisioner:
# POSIX test syntax: the generated script may run under /bin/sh, where [[ is not available
until [ -f /var/lib/cloud/instance/boot-finished ]; do
  sleep 1
done
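As a rough sketch of how that wait could sit in the provisioner from the question: each entry in inline is a single string, so the loop is collapsed onto one line and placed first. It uses POSIX `[ ... ]` on the assumption that the generated script may run under /bin/sh rather than bash. The other resource arguments are assumed to be unchanged from the question.

```hcl
resource "aws_instance" "myapp" {
  # count, ami, instance_type, etc. and the connection block are assumed
  # to be exactly as in the question; only the provisioner changes.

  provisioner "remote-exec" {
    # The first command blocks until cloud-init has written its completion
    # marker, so the apt-get commands only start after cloud-init is done.
    inline = [
      "until [ -f /var/lib/cloud/instance/boot-finished ]; do sleep 1; done",
      "sudo apt-get update",
      "sudo apt-get upgrade -y",
      "sudo apt-get install -f -y openjdk-7-jre-headless git awscli"
    ]
  }
}
```

Note that this loop has no timeout, so if cloud-init never finishes, the provisioner will wait until Terraform's own connection timeout applies.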
Alternatively, you can take advantage of cloud-init and have it install arbitrary packages for you. You can specify user-data for your instance like so in Terraform (modified from your snippet above):
resource "aws_instance" "myapp" {
  count                  = "${var.count}"
  ami                    = "${data.aws_ami.ubuntu.id}"
  instance_type          = "m4.large"
  vpc_security_group_ids = ["${aws_security_group.myapp-security-group.id}"]
  subnet_id              = "${var.subnet_id}"
  key_name               = "${var.key_name}"
  iam_instance_profile   = "${aws_iam_instance_profile.myapp_instance_profile.id}"
  user_data              = "${data.template_cloudinit_config.config.rendered}"
}

# Standard cloud-init stuff
data "template_cloudinit_config" "config" {
  # I've
  gzip          = false
  base64_encode = false

  part {
    content_type = "text/cloud-config"
    content      = <<EOF
packages:
- awscli
- git
- openjdk-7-jre-headless
EOF
  }
}
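The original script also ran apt-get update and apt-get upgrade before installing anything. If you want cloud-init to cover that too, a minimal sketch could add the cloud-config keys package_update and package_upgrade. The data source name config_with_upgrade is hypothetical, chosen only so it doesn't clash with the one above.

```hcl
# Hypothetical variant of the data source above that also refreshes the
# package index and upgrades installed packages before installing.
data "template_cloudinit_config" "config_with_upgrade" {
  gzip          = false
  base64_encode = false

  part {
    content_type = "text/cloud-config"
    content      = <<EOF
package_update: true
package_upgrade: true
packages:
- awscli
- git
- openjdk-7-jre-headless
EOF
  }
}
```

To use it, user_data on the instance would point at "${data.template_cloudinit_config.config_with_upgrade.rendered}" instead.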