I'm dealing with a challenge in our AWS multi-account setup, where we use a distributed egress model. This means that none of our VPCs have a default route (0.0.0.0/0) pointing to the Transit Gateway (TGW). Every time we attach a new VPC to the TGW, I have to manually update the private subnet route tables in all existing VPCs to add a route for the new VPC's CIDR, pointing to the local TGW attachment. As our accounts and VPCs grow, this process is becoming unmanageable and error-prone, so I'm looking for a cleaner, automated way to handle it. Terraform seems like a good option, but I run into issues with cross-account access and role assumption, and it gets confusing with a large number of accounts. Has anyone found a more elegant solution to this problem?
4 Answers
A more automated approach would be to set up an EventBridge rule that catches TGW attachment events. From there, a Lambda function can assume a role in each account and add a route for the new VPC's CIDR, targeting the TGW, to the route tables of the private subnets. Keeping the mapping of accounts, route tables, and CIDRs in SSM or DynamoDB helps centralize that data.
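Here is a minimal sketch of what the Lambda's core logic could look like, assuming boto3 and a cross-account role. The role name `NetworkRouteUpdater`, the function names, and the shape of the `route_tables` mapping are all hypothetical; the new CIDR and TGW ID are passed in directly rather than parsed from the event, since the event format isn't covered here.

```python
def plan_routes(new_cidr, tgw_id, route_tables):
    """Return the create_route arguments still needed for the new CIDR.

    route_tables maps a route table ID to the set of destination CIDRs
    it already contains (hypothetical shape, e.g. loaded from DynamoDB).
    """
    return [
        {
            "RouteTableId": rtb_id,
            "DestinationCidrBlock": new_cidr,
            "TransitGatewayId": tgw_id,
        }
        for rtb_id, existing in sorted(route_tables.items())
        if new_cidr not in existing  # skip tables that already have the route
    ]


def apply_routes(ec2_client, planned):
    """Create each planned route using the given EC2 client."""
    for route in planned:
        ec2_client.create_route(**route)
    return len(planned)


def ec2_client_for_account(account_id, role_name="NetworkRouteUpdater"):
    """Assume a role in the target account and return an EC2 client.

    The role name is an assumption; use whatever role your accounts expose.
    """
    import boto3  # imported here so the planning logic runs without boto3

    creds = boto3.client("sts").assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="tgw-route-sync",
    )["Credentials"]
    return boto3.client(
        "ec2",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```

Keeping the planning step separate from the API calls makes the function safe to re-run: route tables that already contain the CIDR are skipped.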
We handle this complexity by using a centralized egress model with Terraform. We developed a very specific Terraform module for setting up VPCs, which means routing to the internet is straightforward for us. It does help minimize confusion on route updates as you scale up your infrastructure.
For sure, creating a tailored Terraform module is a smart move!
Have you thought about using a wide CIDR block like 10.0.0.0/8? That way, every VPC can have a single summary route into the TGW that covers all your VPC CIDRs, without needing constant updates. It isn't a 0.0.0.0/0 default route, so internet-bound traffic still leaves each VPC locally. It's simpler to manage this way, especially if you centralize your ingress and egress traffic. But I get that it might not be feasible for everyone, depending on their setup.
Totally, that's a viable option! Just keep in mind that more specific routes in a VPC's route table take precedence over the wide one (longest prefix match), so you still have the flexibility you need.
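To illustrate the precedence point, here is a rough sketch of longest-prefix-match selection using Python's standard `ipaddress` module. The route table contents and the `pick_route` helper are made up for the example; this only models the selection rule, not AWS's actual implementation.

```python
import ipaddress

# Hypothetical route table for one VPC: its own CIDR stays local,
# everything else in 10.0.0.0/8 goes to the TGW. No 0.0.0.0/0 entry.
ROUTES = {
    "10.1.0.0/16": "local",
    "10.0.0.0/8": "tgw",
}


def pick_route(destination, routes):
    """Return the target of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no matching route, traffic is dropped
    # The longest prefix (largest prefixlen) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(pick_route("10.1.2.3", ROUTES))  # address inside this VPC
print(pick_route("10.2.0.5", ROUTES))  # address in another VPC
print(pick_route("8.8.8.8", ROUTES))   # internet address
```

With this table, the first lookup resolves to `local`, the second to `tgw`, and the third matches nothing because the table has no default route.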
You can actually configure multiple AWS providers within a single Terraform setup: one for your local account and another that assumes a role in your central network account. When you create your VPC, you set up the Transit Gateway attachment with the local provider and create the Transit Gateway route table entries with the network account provider. I recently implemented this and it worked great!
That's a clever strategy! Using multiple providers makes it much more manageable.
I wouldn't go down that road. It might just get too complicated!