Building a Dynamic AWS Security Group Solution With CSV in Terraform
Learn step-by-step how to build a solution to dynamically manage AWS security groups in Terraform using a CSV file.
Introduction
In my recent blog post about effective AWS security group management in Terraform, I delved into valuable tips based on my experiences. This exploration reignited my interest in a shelved side project from a past DevOps engagement.
The concept revolves around using a comma-separated value (CSV) file to manage security group settings, which offers a streamlined approach to deploying security groups via Terraform. This not only facilitates a centralized GitOps deployment model but also caters to security analysts less familiar with Terraform, simplifying their operational workflow.
Over the weekend, I spent some time designing and developing this solution. The journey was not without its challenges, which required some creative problem-solving and experimentation. Because I believe fellow DevOps engineers can benefit from this approach, I decided to document and share my insights in this blog post.
To enhance the post's readability, I'll explain concepts using code snippets. You'll find the fully runnable Terraform configurations in the accompanying GitHub repository, with each step/section conveniently highlighted. Ready to embark on this mini journey towards a CSV-based solution for security group management? Let's dive right into the design and implementation!
Defining the solution requirements
The main goal of the solution is to provide a simpler way of managing AWS security groups (and specifically the rules) with a CSV file instead of Terraform variables and configuration. Here are some specific requirements to ensure flexibility of the solution:
Use a single CSV file to manage all security groups of the Terraform stack.
Support for all source/destination types - IPv4 CIDRs, IPv6 CIDRs, prefix lists, and security groups.
Support for dynamic values (for example, using the ID from a security group resource that is provisioned in the same Terraform configuration).
For the purpose of explaining the solution, let's consider a target workload that is a traditional three-tier Linux, Apache, MySQL, PHP (LAMP) web application running on AWS with the following architecture:
To keep things simple, each security group for the resources (ALB, web server, MySQL instance) will have an egress rule that allows all outbound traffic. As for ingress rules, the requirements are as follows:
Resource | Source | Protocol and port |
ALB | Internet (0.0.0.0/0) | TCP 443 (HTTPS) |
Web server | ALB (10.0.0.0/24) | TCP 80 (HTTP) |
MySQL instance | Web server (10.0.1.0/24) | TCP 3306 (MySQL) |
As we develop the Terraform configuration in this blog post, we will focus only on creating the security group resources and a VPC resource which the security groups can be associated with. If you are interested in seeing a full solution in action, feel free to add the configuration to provision the network and workload resources at your leisure.
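As a rough sketch of those base resources, the configuration might start with something like the following. The resource name `aws_vpc.main`, the `10.0.0.0/16` VPC CIDR block, and the `name` and `description` values are assumptions for illustration; only the security group resource names `alb`, `web`, and `db` are used later in this post.

```hcl
# Assumed VPC; the CIDR block is a placeholder that contains the
# 10.0.0.0/24 and 10.0.1.0/24 subnets used in the requirements table.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# One security group per tier. No inline ingress/egress blocks are defined
# here because the rules are managed by separate rule resources.
resource "aws_security_group" "alb" {
  name        = "alb"
  description = "Security group for the ALB"
  vpc_id      = aws_vpc.main.id
}

resource "aws_security_group" "web" {
  name        = "web"
  description = "Security group for the web server"
  vpc_id      = aws_vpc.main.id
}

resource "aws_security_group" "db" {
  name        = "db"
  description = "Security group for the MySQL instance"
  vpc_id      = aws_vpc.main.id
}
```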
Defining the security group rule CSV file format
As per the requirements, the CSV file must support both ingress and egress rule definitions of all types for resources in the stack. The file format must adhere to the RFC 4180 specification, which is required by the Terraform csvdecode function that we will use in our Terraform configuration later. Considering the required attributes for the aws_vpc_security_group_ingress_rule and aws_vpc_security_group_egress_rule resources, the CSV file schema can be defined as follows:
Column | Description | Example |
resource_name | The logical name of the resource to which the security group rules apply. | web |
type | One of: ingress , egress | ingress |
name | The name of the rule. | http-alb |
description | The description of the rule. | Allow HTTP access from the public subnet CIDRs |
cidr_ipv4 | The source or destination IPv4 CIDR range. | 10.0.0.0/24 |
cidr_ipv6 | The source or destination IPv6 CIDR range. | 2001:db8::/32 |
prefix_list_id | The ID of the source or destination prefix list. | pl-07b7b831714d4596a |
referenced_security_group_id | The source or destination security group that is referenced in the rule. | sg-0023839dc98251128 |
ip_protocol | The IP protocol name or number. | -1 (all protocols), tcp |
from_port | The start of port range for the TCP and UDP protocols, or an ICMP/ICMPv6 type. | 443 |
to_port | The end of port range for the TCP and UDP protocols, or an ICMP/ICMPv6 code. | 443 |
Since a security group rule expects only one of the four source or destination types, three of them would be optional for each rule. In the CSV file, we will leave those values empty, which will resolve to an empty string as you will see later.
To test the solution, let's define a file called sg-rules.csv with the following content, which specifies all required rules for the ALB, web server, and MySQL instance. We will also start simple and use static values, specifically the subnet CIDR ranges, for the inbound rule sources.
resource_name,type,name,description,cidr_ipv4,cidr_ipv6,prefix_list_id,referenced_security_group_id,ip_protocol,from_port,to_port
db,ingress,mysql-web,Allow MySQL access from the private (web) subnet CIDRs,10.0.1.0/24,,,,tcp,3306,3306
db,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
web,ingress,http-public,Allow HTTP access from the public subnet CIDRs,10.0.0.0/24,,,,tcp,80,80
web,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
alb,ingress,https-all,Allow HTTPS access from the internet,0.0.0.0/0,,,,tcp,443,443
alb,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
Developing the basic Terraform configuration
Now that we have defined the CSV file format, let's write the Terraform configuration. For the base resource definitions, we will use something similar to what is explained in tip #2 of my previous blog post, for example:
# TODO: Adapt this to CSV input
resource "aws_vpc_security_group_ingress_rule" "web" {
  for_each                     = var.web_security_group_rules.ingress
  security_group_id            = aws_security_group.web.id
  cidr_ipv4                    = try(each.value.cidr_ipv4, null)
  cidr_ipv6                    = try(each.value.cidr_ipv6, null)
  prefix_list_id               = try(each.value.prefix_list_id, null)
  referenced_security_group_id = try(each.value.referenced_security_group_id, null)
  from_port                    = each.value.from_port
  ip_protocol                  = each.value.ip_protocol
  to_port                      = each.value.to_port
}
We will need to load the CSV file and use its content, which can be done using the csvdecode function and the file function in a local value. The loaded content will be a list of objects, each representing a row in the CSV file. Column values that are not specified are loaded as empty strings, which we will need to convert to null in some arguments. We also need to adapt the list into a map, which is easier to supply to the for_each meta-argument:
locals {
  sg_rules_csv = csvdecode(file("${path.module}/sg-rules.csv"))
  sg_rules     = { for e in local.sg_rules_csv : "${e.resource_name}-${e.type}-${e.name}" => e }
}
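To make the decoded structure concrete, here is an illustration of roughly what the entry for the web server's HTTP ingress row holds, based on the CSV content defined earlier. The local value name `example_decoded_rule` is hypothetical and exists only for illustration; it is not part of the solution.

```hcl
# Illustration only: the approximate shape of
# local.sg_rules["web-ingress-http-public"] after csvdecode.
# Every value is a string, and empty columns become "".
locals {
  example_decoded_rule = {
    resource_name                = "web"
    type                         = "ingress"
    name                         = "http-public"
    description                  = "Allow HTTP access from the public subnet CIDRs"
    cidr_ipv4                    = "10.0.0.0/24"
    cidr_ipv6                    = ""
    prefix_list_id               = ""
    referenced_security_group_id = ""
    ip_protocol                  = "tcp"
    from_port                    = "80"
    to_port                      = "80"
  }
}
```

Terraform converts numeric strings such as "80" to numbers automatically when they are assigned to number arguments like from_port and to_port, so no explicit conversion should be needed for the port columns.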
To ensure uniqueness, we will use a combination of the resource name, rule type, and rule name as the map key. With a structure that is friendlier to for_each, let's update the rule resource definition to create rules based on the map. Here is an example of the web server security group ingress rule resources:
resource "aws_vpc_security_group_ingress_rule" "web" {
  # Keep only the ingress rows that belong to the web server, keyed by rule name.
  for_each                     = { for k, v in local.sg_rules : v.name => v if v.resource_name == "web" && v.type == "ingress" }
  security_group_id            = aws_security_group.web.id
  # Empty CSV columns are loaded as "", so convert them to null.
  cidr_ipv4                    = try(each.value.cidr_ipv4 != "" ? each.value.cidr_ipv4 : null, null)
  cidr_ipv6                    = try(each.value.cidr_ipv6 != "" ? each.value.cidr_ipv6 : null, null)
  prefix_list_id               = try(each.value.prefix_list_id != "" ? each.value.prefix_list_id : null, null)
  referenced_security_group_id = try(each.value.referenced_security_group_id != "" ? each.value.referenced_security_group_id : null, null)
  from_port                    = each.value.from_port
  ip_protocol                  = each.value.ip_protocol
  to_port                      = each.value.to_port
}
The value of for_each is the result of a for expression that filters for the ingress rules relevant to the web server. The source attribute values are also updated to check for an empty string and set the value to null in that case.
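The egress rules follow the same pattern. As a minimal sketch, assuming the same local.sg_rules map and the aws_security_group.web resource, the web server's egress rule resource could mirror the ingress one with only the resource type and the filter changed:

```hcl
resource "aws_vpc_security_group_egress_rule" "web" {
  # Same filter as the ingress resource, but selecting the "egress" rows.
  for_each                     = { for k, v in local.sg_rules : v.name => v if v.resource_name == "web" && v.type == "egress" }
  security_group_id            = aws_security_group.web.id
  # Empty CSV columns are loaded as "", so convert them to null.
  cidr_ipv4                    = try(each.value.cidr_ipv4 != "" ? each.value.cidr_ipv4 : null, null)
  cidr_ipv6                    = try(each.value.cidr_ipv6 != "" ? each.value.cidr_ipv6 : null, null)
  prefix_list_id               = try(each.value.prefix_list_id != "" ? each.value.prefix_list_id : null, null)
  referenced_security_group_id = try(each.value.referenced_security_group_id != "" ? each.value.referenced_security_group_id : null, null)
  from_port                    = each.value.from_port
  ip_protocol                  = each.value.ip_protocol
  to_port                      = each.value.to_port
}
```

The ALB and MySQL instance would each get an equivalent pair of ingress and egress resources, with the resource name in the filter and the security_group_id reference adjusted accordingly.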
You can find the complete Terraform configuration for this step in the basic directory of the GitHub repository that accompanies this blog post.
Now you can apply the Terraform configuration and see that it completes successfully. For good measure, verify in the AWS Management Console that the security groups are created with the correct set of rules (particularly the inbound rules that refer to the subnet CIDRs).
Adding variable support - first attempt
While referring to subnet CIDRs works, it does not follow the principle of least privilege. For instance, future workloads that are deployed to the web private subnet would also be able to access the MySQL instance. As an improvement, the source of the existing security group ingress rules should instead point to the appropriate workload security group.
Since the security groups are provisioned when the Terraform configuration is applied, their IDs are only known after they are created. It is certainly not desirable to manually copy the IDs into the CSV file afterwards, so we need to find a way to dynamically inject them. To address this, we can consider employing variable substitution.
As many Terraform practitioners know, there is a templatefile function that can read a file while replacing template variables in the file content. (There is also a template_file data source, but it is now considered deprecated.) Let's update the CSV file to use template variables to inject security group IDs at runtime like so:
resource_name,type,name,description,cidr_ipv4,cidr_ipv6,prefix_list_id,referenced_security_group_id,ip_protocol,from_port,to_port
db,ingress,mysql-web,Allow MySQL access from the web server security group,,,,${web_sg_id},tcp,3306,3306
db,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
web,ingress,http-public,Allow HTTP access from the ALB security group,,,,${alb_sg_id},tcp,80,80
web,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
alb,ingress,https-all,Allow HTTPS access from the internet,0.0.0.0/0,,,,tcp,443,443
alb,egress,all,Allow all outgoing traffic,0.0.0.0/0,,,,-1,-1,-1
We also need to update the local value that loads the CSV file as follows:
locals {
  sg_rules_csv = csvdecode(templatefile("${path.module}/sg-rules.csv", {
    "alb_sg_id" = aws_security_group.alb.id
    "web_sg_id" = aws_security_group.web.id
  }))
  sg_rules = { for e in local.sg_rules_csv : "${e.resource_name}-${e.type}-${e.name}" => e }
}
You can find the complete Terraform configuration for this step in the dynamic_attempt_1 directory of the GitHub repository that accompanies this blog post.
All seems well; however, when we apply the Terraform configuration, it fails with a few errors similar to the one below:
Error: Invalid for_each argument
│
│ on main.tf line 38, in resource "aws_vpc_security_group_ingress_rule" "db":
│ 38: for_each = { for k, v in local.sg_rules : "${v.name}" => v if v.resource_name == "db" && v.type == "ingress" }
│ ├────────────────
│ │ local.sg_rules will be known only after apply
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of
│ keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the
│ map values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time
│ to fully converge.
So what exactly are these errors about, and how do we fix them?
Fixing the for_each key issue and finalizing the solution
As the error message explains, for_each requires that the map keys be known at plan time. In fact, it is a frustratingly common problem that many Terraform practitioners have encountered. The limitation is also explained in the for_each documentation.
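To illustrate the rule in isolation, here is a minimal sketch that is separate from the solution itself. The names `sg_ids_by_name` and `sg_id_echo` are hypothetical, and the built-in terraform_data resource (which assumes Terraform 1.4 or later) is used purely as a stand-in for any resource with for_each.

```hcl
locals {
  # The keys "alb" and "web" are literals, so they are known at plan time.
  # Only the values depend on resources that do not exist yet.
  sg_ids_by_name = {
    alb = aws_security_group.alb.id
    web = aws_security_group.web.id
  }
}

# Plans successfully: Terraform knows there will be exactly two instances,
# "alb" and "web", even though each.value is unknown until apply.
resource "terraform_data" "sg_id_echo" {
  for_each = local.sg_ids_by_name
  input    = each.value
}

# By contrast, a map keyed by the IDs themselves would trigger the
# "Invalid for_each argument" error on a fresh apply, for example:
#   for_each = { for id in [aws_security_group.alb.id] : id => id }
```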
The problem is that the sg_rules map is derived from the sg_rules_csv local value, which is loaded using the templatefile function with template variable replacement. Because the injected security group IDs are unknown until apply, the rendered CSV content as a whole is unknown at plan time, and so is everything derived from it, including the map keys (for instance, a template variable could in principle substitute entire rows into the content). To us this feels like it should be fair game, because we are only replacing a few values within a row rather than entire rows, but Terraform cannot tell the difference.
Since we are sure that most of the content is static, particularly the fields that comprise the key for sg_rules, we can build a static list from it for looping while using the dynamically loaded CSV file content for all other information. For this, we will need some new local values:
locals {
  sg_rules_csv = csvdecode(templatefile("${path.module}/sg-rules.csv", {
    "web_sg_id" = aws_security_group.web.id
    "alb_sg_id" = aws_security_group.alb.id
  }))
  sg_rule_names = [for e in csvdecode(file("${path.module}/sg-rules.csv")) : "${e.resource_name}-${e.type}-${e.name}"]
  sg_rules      = { for e in local.sg_rules_csv : "${e.resource_name}-${e.type}-${e.name}" => e }
}
Notice that there is now a new list called sg_rule_names, which contains the map keys and is built from the CSV file content loaded using the vanilla file function, which performs no variable substitution. Meanwhile, we keep the sg_rules_csv and sg_rules values the same. What's important is that sg_rule_names must contain exactly the same keys as sg_rules, which holds as long as both use the same key expression and no template variable appears in the key columns.
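If you want a safety net for that invariant, one option is a check block, sketched below. This is an optional addition rather than part of the original solution: it assumes Terraform 1.5 or later (where check blocks are available), and the check name `sg_rule_keys_match` is hypothetical.

```hcl
check "sg_rule_keys_match" {
  assert {
    # sort() turns both sides into list(string) so that the comparison
    # is not affected by list versus tuple type differences.
    condition     = sort(local.sg_rule_names) == sort(keys(local.sg_rules))
    error_message = "The static key list (sg_rule_names) and the templated rule map (sg_rules) have diverged. Make sure no template variable is used in the resource_name, type, or name columns."
  }
}
```

Because local.sg_rules depends on the security group IDs, the comparison can only be fully evaluated once those IDs are known, so on a first run it would be expected to resolve during apply rather than during plan.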
We can now update the for_each value in the rule resources to iterate using the static list of keys in local.sg_rule_names while fetching rule settings from local.sg_rules, as shown below:
resource "aws_vpc_security_group_ingress_rule" "web" {
  for_each                     = { for k in local.sg_rule_names : k => local.sg_rules[k] if startswith(k, "web-ingress") }
  security_group_id            = aws_security_group.web.id
  cidr_ipv4                    = try(each.value.cidr_ipv4 != "" ? each.value.cidr_ipv4 : null, null)
  cidr_ipv6                    = try(each.value.cidr_ipv6 != "" ? each.value.cidr_ipv6 : null, null)
  prefix_list_id               = try(each.value.prefix_list_id != "" ? each.value.prefix_list_id : null, null)
  referenced_security_group_id = try(each.value.referenced_security_group_id != "" ? each.value.referenced_security_group_id : null, null)
  from_port                    = each.value.from_port
  ip_protocol                  = each.value.ip_protocol
  to_port                      = each.value.to_port
}
Note that we also need to use static values in the if condition of the for expression in for_each instead of attributes in the map values. While it is not the most elegant solution, the hardcoding is still passable because we know specifically the resource and type to which the resource block is applicable.
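The other rule resources follow the same shape with a different hardcoded prefix. As a sketch that mirrors the web server example, the MySQL instance's ingress rules might look like this:

```hcl
resource "aws_vpc_security_group_ingress_rule" "db" {
  # Iterate over the static keys and filter by the hardcoded "db-ingress" prefix;
  # the rule settings come from the templated map.
  for_each                     = { for k in local.sg_rule_names : k => local.sg_rules[k] if startswith(k, "db-ingress") }
  security_group_id            = aws_security_group.db.id
  cidr_ipv4                    = try(each.value.cidr_ipv4 != "" ? each.value.cidr_ipv4 : null, null)
  cidr_ipv6                    = try(each.value.cidr_ipv6 != "" ? each.value.cidr_ipv6 : null, null)
  prefix_list_id               = try(each.value.prefix_list_id != "" ? each.value.prefix_list_id : null, null)
  referenced_security_group_id = try(each.value.referenced_security_group_id != "" ? each.value.referenced_security_group_id : null, null)
  from_port                    = each.value.from_port
  ip_protocol                  = each.value.ip_protocol
  to_port                      = each.value.to_port
}
```

Here the referenced_security_group_id column resolves to the web server security group ID injected through the web_sg_id template variable, while the key that identifies the rule instance stays static.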
You can find the complete Terraform configuration for this step in the final directory of the GitHub repository that accompanies this blog post.
With these updates, the Terraform configuration can now be applied successfully! Please make sure that you verify the security groups and their rules in the AWS Management Console.
Maintaining the solution
In terms of maintenance, whenever you need to refer to new values that are derived from new resources such as:
Subnet CIDR blocks (for example, aws_subnet.private.cidr_block)
Managed prefix lists (for example, aws_ec2_managed_prefix_list.office_vpn.id)
Security groups (for example, aws_security_group.msk.id)
You will need to do the following, taking care to use unique variable names:
Define new template replacement variables (for example, ${subnet_private_cidr}, ${office_vpn_prefix_list_id}, and ${msk_sg_id} per above).
Update the list of variables in the templatefile argument of the sg_rules_csv local value.
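As a sketch of what that update might look like with the three example variables above, assuming the aws_subnet.private, aws_ec2_managed_prefix_list.office_vpn, and aws_security_group.msk resources are defined elsewhere in the configuration (the CSV row in the comment is a hypothetical example):

```hcl
locals {
  sg_rules_csv = csvdecode(templatefile("${path.module}/sg-rules.csv", {
    "alb_sg_id" = aws_security_group.alb.id
    "web_sg_id" = aws_security_group.web.id
    # New template variables; each name must be unique within this map.
    "subnet_private_cidr"       = aws_subnet.private.cidr_block
    "office_vpn_prefix_list_id" = aws_ec2_managed_prefix_list.office_vpn.id
    "msk_sg_id"                 = aws_security_group.msk.id
  }))
}

# A hypothetical CSV row that uses one of the new variables:
#   msk,ingress,kafka-office,Allow Kafka access from the office VPN,,,${office_vpn_prefix_list_id},,tcp,9092,9092
```

Keep the template variables out of the resource_name, type, and name columns, since those columns form the static map keys. Also note that every `${...}` sequence in the CSV file is treated as a template expression by templatefile, so a literal one would need to be escaped as `$${...}`.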
Summary
Congratulations, you have just built a CSV-based solution for managing AWS security groups using Terraform! By employing a sensible design and naming scheme, as well as thoughtfully using Terraform functions and constructs, all rule settings can be maintained in a single CSV file. The same design can be extended to other cloud providers' corresponding concepts, such as network security groups in Azure.
If you find this solution helpful, please check out my other blog posts or let me know what you'd like to learn more about!