How Can I Monitor AWS On-Demand Capacity Outages More Effectively?

Asked By WanderLust42 On

Hey everyone! I recently set up a solution that requires one g6.xlarge instance now and then. I thought a capacity outage lasting longer than a few hours was pretty unlikely, but we just experienced a 48-hour one in my region. This got me thinking about how often similar outages happen and how long they typically last, as I need to explain this situation to some higher-ups at my organization. I've asked our corporate AWS contact, but they don't provide any of this info.

Are there any third-party websites or tools out there to track AWS on-demand capacity outages? I'm having a hard time finding anything useful. I know about reserved instances and other options, but I'm specifically looking for data on on-demand capacity stats. It seems to me like it would be simple to set up a service to periodically launch an EC2 instance, check the capacity, and publish the results. Is there something I'm missing? Surely someone is already doing this? Thanks!

3 Answers

Answered By CapacityGuru On

We faced a similar challenge with g6 instances, so we built a simple monitoring tool. A script tries to launch an instance every 10 minutes, tags it for easy tracking, and terminates it immediately if the launch succeeds. We use CloudWatch to track any capacity-related errors, which has helped a lot.
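As a rough illustration of that kind of probe, here is a minimal Python sketch. It is not the script described above, just one way it could look. It assumes a boto3-style EC2 client and CloudWatch client are passed in, and the function names, the `CapacityProbe` namespace, and the `CapacityFailure` metric name are all made up for the example. `InsufficientInstanceCapacity` is the error code EC2 returns when a launch fails for lack of capacity.

```python
import datetime as dt

# Error code EC2 returns when the On-Demand pool cannot fit the request.
CAPACITY_ERROR_CODES = {"InsufficientInstanceCapacity"}


def probe_capacity(ec2, launch_params):
    """Attempt one launch, terminate on success, and return a result record."""
    record = {
        "time": dt.datetime.now(dt.timezone.utc).isoformat(),
        "instance_type": launch_params.get("InstanceType"),
        "available": None,  # None = unknown (launch failed for a non-capacity reason)
        "error_code": None,
    }
    try:
        resp = ec2.run_instances(MinCount=1, MaxCount=1, **launch_params)
    except Exception as exc:
        # botocore's ClientError exposes the AWS error code in exc.response.
        response = getattr(exc, "response", None)
        code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
        if code is None:
            raise  # not an AWS API error, so let it surface
        record["error_code"] = code
        if code in CAPACITY_ERROR_CODES:
            record["available"] = False
        return record
    ids = [inst["InstanceId"] for inst in resp["Instances"]]
    ec2.terminate_instances(InstanceIds=ids)  # terminate right away to limit cost
    record["available"] = True
    return record


def publish(cloudwatch, record, namespace="CapacityProbe"):
    """Publish 1 for a capacity failure and 0 for a successful launch."""
    if record["available"] is None:
        return False  # skip unknown results so they do not skew the metric
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[{
            "MetricName": "CapacityFailure",
            "Dimensions": [{"Name": "InstanceType", "Value": record["instance_type"]}],
            "Value": 0.0 if record["available"] else 1.0,
            "Unit": "Count",
        }],
    )
    return True
```

With real clients this would be called as `probe_capacity(boto3.client("ec2"), params)`, where `params` carries your own `ImageId`, `InstanceType`, `SubnetId`, and a `TagSpecifications` entry for the tracking tag. Note that a dry-run request would not work here, since it only checks permissions, so each successful probe is a real (briefly billed) launch.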

I don’t know of any third-party services that offer this monitoring, but your idea is solid. We built something similar and are now using pointfive to analyze capacity failures relative to our spending patterns, and we found that capacity issues often align with our peak spending times.

DataDiver22 -

Thanks for sharing your approach! It’s always nice to get insights from someone in the trenches.

Answered By TechSavvyGuru On

You might want to look into creating an On-Demand Capacity Reservation to guarantee availability for your g6.xlarge. But I'm curious, do you really need that specific instance type? Could you possibly use a different type instead?

GadgetGeek99 -

I get what you're saying, but spending more isn’t always the answer, especially if you're working with a small team facing a long to-do list. Figuring out a new setup sounds great, but we might just need more information about these outages instead.

Answered By ExAWSPro On

From my experience at AWS, these capacity issues can change literally every second. They don't keep extra spare capacity around, so sometimes customers just miss out. Remember that availability varies per Availability Zone (AZ) and each instance type has its own capacity pool.

Instead of trying to monitor instances, it might be better to design your workload so it doesn't depend on specific node types. If you're tied to that instance family, at least consider accepting a variety of instance types, or running across different AZs or regions. If you need that instance often, consider a zonal Reserved Instance (RI), which includes a capacity reservation in its AZ (a regional RI does not). Good luck!
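For what it's worth, the "variety of instance types and AZs" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `ec2` is a boto3-style EC2 client, `launch_with_fallback` is a made-up helper name, and the candidate list is whatever combinations your workload can tolerate. Each subnet lives in one AZ, so picking a subnet picks the AZ.

```python
def launch_with_fallback(ec2, base_params, candidates):
    """Try each (instance_type, subnet_id) pair in order.

    Returns (instance_id, skipped), where skipped lists the pairs that
    failed with a capacity error. instance_id is None if all pairs failed.
    """
    skipped = []
    for instance_type, subnet_id in candidates:
        params = dict(base_params, InstanceType=instance_type, SubnetId=subnet_id)
        try:
            resp = ec2.run_instances(MinCount=1, MaxCount=1, **params)
        except Exception as exc:
            # botocore's ClientError exposes the AWS error code in exc.response.
            response = getattr(exc, "response", None)
            code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
            if code != "InsufficientInstanceCapacity":
                raise  # only capacity errors should trigger a fallback
            skipped.append((instance_type, subnet_id))
            continue
        return resp["Instances"][0]["InstanceId"], skipped
    return None, skipped
```

A candidate list might look like `[("g6.xlarge", "subnet-a"), ("g6.xlarge", "subnet-b"), ("g6.2xlarge", "subnet-a")]`, with the subnet IDs here being placeholders. The `skipped` list doubles as a record of which pools were out of capacity at that moment.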

CloudWatcher88 -

That’s good advice, but knowing outage frequencies would really help too. Just knowing you can or can’t start an instance isn't enough when I need to make decisions about what to use.
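If you do end up collecting your own probe results, turning them into outage frequency and duration is straightforward. The Python sketch below assumes a hypothetical log of `(timestamp, available)` samples sorted by time; the function names are made up, and the durations are only as precise as the probe interval.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Group consecutive failed probes into (start, end) windows.

    samples: list of (datetime, available) tuples sorted by time.
    end is the first successful probe after the failures, or the last
    failed probe if the outage is still ongoing at the end of the log.
    """
    windows = []
    start = None
    last = None
    for ts, available in samples:
        if not available and start is None:
            start = ts  # outage begins
        elif available and start is not None:
            windows.append((start, ts))  # outage ends
            start = None
        last = ts
    if start is not None:
        windows.append((start, last))
    return windows


def summarize(windows):
    """Return the count, longest duration, and total duration of the outages."""
    durations = [end - start for start, end in windows]
    return {
        "count": len(durations),
        "longest": max(durations, default=timedelta(0)),
        "total": sum(durations, timedelta(0)),
    }
```

Those three numbers (how many outages, the longest one, and total time unavailable over the logging period) are probably the figures management will ask about first.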

DevOpsDynamo -

Also, for monitoring capacity, you can check spot prices. Higher prices generally indicate lower availability.
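A quick way to pull those prices is the EC2 `DescribeSpotPriceHistory` API. The Python sketch below assumes a boto3-style EC2 client; `latest_spot_prices` and the one-day lookback are choices made for the example, and it reads only the first page of results. Treat the output as a rough proxy at best, since Spot prices and On-Demand capacity are not tied together directly.

```python
from datetime import datetime, timedelta, timezone


def latest_spot_prices(ec2, instance_type, product="Linux/UNIX", lookback_hours=24):
    """Return {availability_zone: most recent Spot price in USD per hour}."""
    resp = ec2.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=[product],
        StartTime=datetime.now(timezone.utc) - timedelta(hours=lookback_hours),
    )
    latest = {}
    for item in resp["SpotPriceHistory"]:
        az = item["AvailabilityZone"]
        # Keep only the newest entry per AZ; SpotPrice arrives as a string.
        if az not in latest or item["Timestamp"] > latest[az][0]:
            latest[az] = (item["Timestamp"], float(item["SpotPrice"]))
    return {az: price for az, (_, price) in latest.items()}
```

Logging this alongside your own launch attempts would show whether price and availability actually move together for your instance type and region.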
