What’s the best way to handle an Availability Zone failure behind an ALB?

Asked By CuriousCoder92 On

I'm curious about how to manage a situation where an entire Availability Zone (AZ) experiences a network outage. I'm using an Application Load Balancer (ALB), and my Route 53 alias points to this ALB, which returns IP addresses for multiple AZs. If my client doesn't implement circuit breaking or retries, will it keep failing on the inactive leg of the ALB until the client TTL expires? Then, there's a chance it could receive the same broken address when the TTL expires since the ALB won't update Route 53 dynamically. Are there any strategies to address this issue? Also, I believe the 'Evaluate Target Health' option on an Alias won't help here, given that it checks backend target health and not the ALB itself.

3 Answers

Answered By FaultToleranceGuru On

Yes, you're correct. The health checks can take a bit of time to relay failure information downstream, so clients should definitely have a retry mechanism and fallback plan in place if you're looking for fault tolerance.
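To make the retry idea concrete, here is a minimal sketch of a client that tries every address DNS returned instead of sticking to the first one. It assumes plain TCP connects with a short timeout; the function names `resolve_all` and `connect_with_failover` are made up for illustration, and a real client would also need TLS, backoff, and request-level retries.

```python
import socket
from typing import Callable, List, Sequence, Tuple


def resolve_all(host: str, port: int) -> List[str]:
    """Return every distinct IPv4 address DNS currently gives for host."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    seen: List[str] = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if ip not in seen:
            seen.append(ip)
    return seen


def connect_with_failover(
    addresses: Sequence[str],
    port: int,
    timeout: float = 2.0,
    connect: Callable = socket.create_connection,
) -> Tuple[str, object]:
    """Try each address in order and return (ip, connection) for the first
    one that accepts the connection. Raises ConnectionError if all fail."""
    errors = []
    for ip in addresses:
        try:
            return ip, connect((ip, port), timeout)
        except OSError as exc:
            # This leg (possibly a whole AZ) is unreachable; try the next one.
            errors.append((ip, exc))
    raise ConnectionError(f"all {len(errors)} addresses failed: {errors}")
```

Usage would look something like `ip, sock = connect_with_failover(resolve_all("my-alb.example.com", 443), 443)`, where the hostname is a placeholder. The short connect timeout is what keeps a dead AZ from stalling the client until the DNS TTL runs out.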

Answered By ChaosExplorer22 On

Definitely check out Chaos Engineering along with the AWS Fault Injection Service (FIS, formerly Fault Injection Simulator). There's a workshop that dives into AZ disruptions and other failure scenarios. It could be helpful for your case!

NetworkNinja11 -

I feel you there, but most examples focus on backend services. My concern is more about the ALB connection failing. I haven't really found a way to simulate that without triggering a DNS update, which is precisely what I want to avoid.

Answered By NewFeatureFan On

Good news! Zonal shift (part of Amazon Application Recovery Controller) now supports ALBs with cross-zone load balancing enabled, so you can shift traffic away from an impaired AZ. It might be worth checking out for your situation!
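If you want to try it from code, a rough sketch with boto3 could look like the following. It assumes the `arc-zonal-shift` client and its `start_zonal_shift` call, that zonal shift is already enabled on the load balancer, and that credentials and region are configured. The ARN and AZ ID in the usage line are placeholders, and the helper names are made up for illustration.

```python
from typing import Dict


def build_zonal_shift_request(
    alb_arn: str,
    az_id: str,
    expires_in: str = "30m",
    comment: str = "Shift traffic away from impaired AZ",
) -> Dict[str, str]:
    """Assemble the parameters for a zonal shift away from one AZ.

    az_id is an AZ ID such as "use1-az1", not an AZ name like "us-east-1a".
    expires_in is a duration string in minutes or hours, e.g. "30m" or "2h".
    """
    return {
        "resourceIdentifier": alb_arn,
        "awayFrom": az_id,
        "expiresIn": expires_in,
        "comment": comment,
    }


def start_shift(request: Dict[str, str]) -> dict:
    """Send the request to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the builder works without boto3 installed

    client = boto3.client("arc-zonal-shift")
    return client.start_zonal_shift(**request)
```

Example: `start_shift(build_zonal_shift_request("<your-alb-arn>", "use1-az1"))`. The shift expires on its own after `expires_in`, so it is meant as a temporary mitigation rather than a permanent routing change.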
