
Terraform - why this is not causing circular dependency?

Tags:

terraform

The Terraform Registry AWS VPC example, terraform-aws-vpc/examples/complete-vpc/main.tf, contains the code below, which looks to me like a circular dependency.

data "aws_security_group" "default" {
  name   = "default"
  vpc_id = module.vpc.vpc_id
}

module "vpc" {
  source = "../../"

  name = "complete-example"

...
  # VPC endpoint for SSM
  enable_ssm_endpoint              = true
  ssm_endpoint_private_dns_enabled = true
  ssm_endpoint_security_group_ids  = [data.aws_security_group.default.id] # <-----

...

data.aws_security_group.default refers to "module.vpc.vpc_id" and module.vpc refers to "data.aws_security_group.default.id".

Please explain why this does not cause an error, and how module.vpc can refer to data.aws_security_group.default.id.

mon asked Feb 22 '20 02:02





1 Answer

In the Terraform language, a module creates a separate namespace, but it is not itself a node in the dependency graph. Instead, each of the module's Input Variables and Output Values is a separate node in the dependency graph.

For that reason, this configuration contains the following dependencies:

  • The data.aws_security_group.default resource depends on module.vpc.vpc_id, which is specifically the output "vpc_id" block in that module, not the module as a whole.
  • The vpc module's variable "ssm_endpoint_security_group_ids" depends on the data.aws_security_group.default resource.

We can't see the inside of the vpc module in your question here, but the above is okay as long as there is no dependency connection between output "vpc_id" and variable "ssm_endpoint_security_group_ids" inside the module.
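To make "dependency connection" concrete, here is a contrived sketch of module internals that would create a cycle. Everything in it is hypothetical: the resource name aws_vpc.example, the CIDR block, and the tag are invented for illustration and are not taken from the real terraform-aws-vpc module.

```hcl
# Hypothetical module internals that WOULD form a cycle with the
# root configuration shown in the question.

variable "ssm_endpoint_security_group_ids" {
  type    = list(string)
  default = []
}

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16" # assumed value

  # This reference is the problem: the VPC now depends on the variable.
  tags = {
    SsmSecurityGroups = join(",", var.ssm_endpoint_security_group_ids)
  }
}

# The output depends on the VPC, and therefore indirectly on the variable.
output "vpc_id" {
  value = aws_vpc.example.id
}
```

With internals like these, the chain would be: the variable depends on data.aws_security_group.default, which depends on module.vpc.vpc_id, which depends on aws_vpc.example, which depends on the variable again. Terraform would be expected to reject that configuration with a cycle error.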

I'm assuming that such a connection does not exist, and so the evaluation order of objects here would be something like this:

  • aws_vpc.example in module.vpc is created (I just made up a name for this because it's not included in your question)
  • The output "vpc_id" in module.vpc is evaluated, referring to module.vpc.aws_vpc.example, and producing module.vpc.vpc_id.
  • data.aws_security_group.default in the root module is read, using the value of module.vpc.vpc_id.
  • The variable "ssm_endpoint_security_group_ids" for module.vpc is evaluated, referring to data.aws_security_group.default.
  • aws_vpc_endpoint.example in module.vpc is created, including a reference to var.ssm_endpoint_security_group_ids.

Notice that in all of the above I'm talking about objects in modules, not modules themselves. The modules serve only to create separate namespaces for objects, and then the separate objects themselves (which includes individual variable and output blocks) are what participate in the dependency graph.
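As a rough illustration, the evaluation order above would fit a module shaped like the following sketch. The names aws_vpc.example and aws_vpc_endpoint.example are the made-up ones used in the list, and the CIDR block and the region in the service name are assumptions; the real module's internals are not shown in the question.

```hcl
# Hypothetical, minimal internals for module.vpc.

variable "ssm_endpoint_security_group_ids" {
  type    = list(string)
  default = []
}

# Step 1: created first; depends on nothing else in this sketch.
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16" # assumed value
}

# Step 2: depends only on aws_vpc.example, not on the variable above.
output "vpc_id" {
  value = aws_vpc.example.id
}

# Step 5: the only object that uses the variable, so it has to wait until
# the root module's data source has been read (steps 3 and 4).
resource "aws_vpc_endpoint" "example" {
  vpc_id              = aws_vpc.example.id
  service_name        = "com.amazonaws.us-east-1.ssm" # assumed region
  vpc_endpoint_type   = "Interface"
  security_group_ids  = var.ssm_endpoint_security_group_ids
  private_dns_enabled = true
}
```

Because output "vpc_id" and variable "ssm_endpoint_security_group_ids" never meet inside the module, the graph stays acyclic even though the root module both reads from and writes to module.vpc.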


This design detail usually isn't visible: Terraform mainly uses it to improve concurrency, by beginning work on part of a module before the whole module is ready to process. In some interesting cases like this one, though, you can also deliberately exploit it so that an operation in the calling module is sandwiched between two operations in the child module.

Another reason why we might make use of this capability is when two modules naturally depend on one another, such as in an experimental module I built that hides some of the tricky details of setting up VPC peering connections:

locals {
  vpc_nets = {
    us-west-2 = module.vpc_usw2
    us-east-1 = module.vpc_use1
  }
}

module "peering_usw2" {
  source = "../../modules/peering-mesh"

  region_vpc_networks = local.vpc_nets
  other_region_connections = {
    us-east-1 = module.peering_use1.outgoing_connection_ids
  }

  providers = {
    aws = aws.usw2
  }
}

module "peering_use1" {
  source = "../../modules/peering-mesh"

  region_vpc_networks = local.vpc_nets
  other_region_connections = {
    us-west-2 = module.peering_usw2.outgoing_connection_ids
  }

  providers = {
    aws = aws.use1
  }
}

(The above is just a relevant snippet from an example in the module repository.)

In the above case, the peering-mesh module is carefully designed to allow this mutual referencing: internally, it decides for each pair of regional VPCs which one will be the peering initiator and which one will be the peering accepter. The outgoing_connection_ids output refers only to the aws_vpc_peering_connection resource, and the aws_vpc_peering_connection_accepter resource refers only to var.other_region_connections. The result is a batch of concurrent operations to create aws_vpc_peering_connection resources, followed by a batch of concurrent operations to create aws_vpc_peering_connection_accepter resources.
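The separation described above could look roughly like the following sketch. It is not the real peering-mesh code: the outgoing_peers variable stands in for whatever the module derives from region_vpc_networks, the variable types and resource labels are assumptions, and the logic that picks initiator versus accepter (and filters connections by target region) is left out entirely.

```hcl
# Hypothetical, heavily simplified internals for a module like peering-mesh.

variable "outgoing_peers" {
  # Assumed: connection key => details of a peering this side initiates.
  type = map(object({
    vpc_id      = string
    peer_vpc_id = string
    peer_region = string
  }))
}

variable "other_region_connections" {
  # Assumed: remote region => (connection key => peering connection ID).
  type = map(map(string))
}

# Depends only on var.outgoing_peers, never on var.other_region_connections.
resource "aws_vpc_peering_connection" "outgoing" {
  for_each = var.outgoing_peers

  vpc_id      = each.value.vpc_id
  peer_vpc_id = each.value.peer_vpc_id
  peer_region = each.value.peer_region
}

# Depends only on the outgoing connections.
output "outgoing_connection_ids" {
  value = { for k, c in aws_vpc_peering_connection.outgoing : k => c.id }
}

# Depends only on var.other_region_connections.
resource "aws_vpc_peering_connection_accepter" "incoming" {
  for_each = merge(values(var.other_region_connections)...)

  vpc_peering_connection_id = each.value
  auto_accept               = true
}
```

Since the output never touches var.other_region_connections, two instances of such a module can feed each other's outputs into each other's variables without forming a cycle.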

Martin Atkins answered Sep 20 '22 17:09
