I'm working with a distributed egress model in our AWS setup, which means there's no default route pointing to the Transit Gateway (TGW) in our VPCs. Whenever we attach a new VPC to the TGW, I have to manually edit the private subnet route tables in every existing VPC and add a route for the new VPC's CIDR that targets the TGW. This is manageable for a few VPCs, but as we scale, it's becoming error-prone and hard to maintain. I'm looking for a clean and scalable way to automate this process. Terraform seems like a potential solution, but it introduces complexity around cross-account access and role assumption. I'd love to hear how others have tackled this automation at scale!
5 Answers
In our setup, we also use centralized egress through the Transit Gateway. We decided to develop a Terraform module that standardizes how VPCs are set up. This way, every new VPC follows the same rules, which helps minimize complications as we grow.
You could set up an EventBridge rule to listen for TGW attachment events. Then, have a Lambda function that assumes the necessary roles for each VPC to automatically update the route tables with the new VPC CIDR. You could keep your mapping info in SSM or DynamoDB to make management easier.
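To make the Lambda idea a bit more concrete, here is a minimal Python sketch of the part that does the actual work. It assumes boto3 in the Lambda runtime and a cross-account role that already exists in each VPC account. The function names, the role name you pass in, and the session name are placeholders I made up, and the EventBridge event parsing and the SSM/DynamoDB lookup are left out because they depend on your setup.

```python
from typing import Any, Iterable, List


def ec2_client_for(account_id: str, role_name: str, region: str) -> Any:
    """Return an EC2 client that uses credentials from an assumed role.

    role_name is a hypothetical cross-account role you would create yourself.
    """
    import boto3  # imported lazily so the routing logic can be tested without AWS

    creds = boto3.client("sts").assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="tgw-route-sync",  # arbitrary session name
    )["Credentials"]
    return boto3.client(
        "ec2",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


def add_tgw_routes(
    ec2: Any, route_table_ids: Iterable[str], new_cidr: str, tgw_id: str
) -> List[str]:
    """Add a new_cidr -> tgw_id route to each table; return the tables changed."""
    changed = []
    for rtb_id in route_table_ids:
        try:
            ec2.create_route(
                RouteTableId=rtb_id,
                DestinationCidrBlock=new_cidr,
                TransitGatewayId=tgw_id,
            )
            changed.append(rtb_id)
        except Exception as exc:
            # botocore's ClientError exposes the error code on exc.response.
            # Skip tables that already have the route so reruns are harmless.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code != "RouteAlreadyExists":
                raise
    return changed
```

The handler would then loop over the accounts and route table IDs from your mapping store, call `ec2_client_for(...)` once per account, and pass the result to `add_tgw_routes(...)`. Skipping `RouteAlreadyExists` means a retried or duplicated event does not fail the whole run.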
That’s a cool solution! Just make sure to account for Lambda limits, such as the 15-minute maximum execution time and concurrency quotas, if your setup scales significantly.
Using multiple AWS provider configurations (provider aliases) in your Terraform setup can simplify this. You can have one for the local account and another that assumes a role in the central network account, so a single configuration can manage the route updates on both sides.
Have you considered using VPC Lattice instead of TGW? It might help streamline your architecture in the long run.
One option is to use a broader entry like 10.0.0.0/8 for your routes. This way, you have a single summary route pointing to the TGW that acts like a default route for your private address space, without needing to update it every time a new VPC is attached. Internet-bound traffic still leaves through each VPC's own egress path. I understand this might not fit all architectures, but it could simplify your setup a lot if you're managing centralized ingress and egress!
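Before relying on a summary route, it is worth checking that every VPC CIDR you plan to attach actually falls inside it. Here is a small Python sketch using the standard `ipaddress` module. The function name and the sample CIDRs are made up for illustration, and it only handles IPv4.

```python
import ipaddress
from typing import Iterable, List


def uncovered_cidrs(vpc_cidrs: Iterable[str], summary: str = "10.0.0.0/8") -> List[str]:
    """Return the VPC CIDRs that a summary route would NOT cover."""
    summary_net = ipaddress.ip_network(summary)
    return [
        cidr
        for cidr in vpc_cidrs
        if not ipaddress.ip_network(cidr).subnet_of(summary_net)
    ]


# Hypothetical VPC CIDRs: the 172.16.0.0/16 VPC would still need its own route.
print(uncovered_cidrs(["10.1.0.0/16", "10.200.0.0/16", "172.16.0.0/16"]))
```

Each VPC's own local route is more specific than the /8, so traffic that stays inside a VPC is unaffected. Only CIDRs the function returns would still need explicit routes.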
True! Having a default route can definitely reduce the need for updates. It simplifies things if you control the ingress/egress centrally.
Totally agree! Creating a module like that really can simplify maintaining your infrastructure.