Hey everyone, I've been assigned a project at work to set up real-time monitoring for over 150 EC2 instances across two different AWS organizations and multiple child accounts. I'm looking to track CPU and memory utilization along with system check failures. I've tried using CloudWatch alarms, but I hit a snag since the SNS topics and alarms need to be in the same region. With instances spread across various regions, creating numerous SNS topics seems impractical. I'm hoping to find a simpler solution that still meets our requirement for custom formatted alerts. Any suggestions?
3 Answers
You could consider using Prometheus and Grafana. It's a free, open-source stack that is cloud- and region-agnostic, and once the initial setup is done it could save some headaches down the line!
If you're open to it, you might check out Datadog. While it's a bit on the pricier side, many users find it the easiest option for setting up comprehensive monitoring across multiple accounts.
You could create a centralized monitoring account with CloudWatch to gather metrics from all your different accounts. To handle custom alert notifications, consider putting a Lambda function on top to format and route them. Third-party tools are really handy, but if sticking strictly to AWS is mandatory, this approach keeps everything within AWS services.
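To make the Lambda idea a bit more concrete, here's a minimal sketch of what the formatting function might look like. It assumes (this is my assumption, not something stated above) that alarm state changes from each account and region are forwarded through EventBridge to an event bus in the monitoring account, and that the Lambda is triggered by the "CloudWatch Alarm State Change" event. The `ALERT_TOPIC_ARN` environment variable and the function names are hypothetical, so adjust them to your setup.

```python
import json
import os


def format_alert(event: dict) -> str:
    """Turn a CloudWatch Alarm State Change event into a plain-text alert."""
    detail = event.get("detail", {})
    state = detail.get("state", {})

    # Use the first metric in the alarm configuration, if one is present.
    metrics = detail.get("configuration", {}).get("metrics", [])
    metric = metrics[0].get("metricStat", {}).get("metric", {}) if metrics else {}
    dimensions = metric.get("dimensions", {})

    lines = [
        f"[{state.get('value', 'UNKNOWN')}] {detail.get('alarmName', 'unnamed alarm')}",
        f"Account: {event.get('account', 'n/a')}",
        f"Region: {event.get('region', 'n/a')}",
        f"Instance: {dimensions.get('InstanceId', 'n/a')}",
        f"Metric: {metric.get('namespace', 'n/a')}/{metric.get('name', 'n/a')}",
        f"Reason: {state.get('reason', 'n/a')}",
    ]
    return "\n".join(lines)


def lambda_handler(event, context):
    # boto3 ships with the Lambda Python runtime; imported here so that
    # format_alert can be tested locally without it.
    import boto3

    # Hypothetical environment variable holding the single central topic ARN.
    topic_arn = os.environ["ALERT_TOPIC_ARN"]
    # An SNS ARN looks like arn:aws:sns:<region>:<account>:<name>, so the
    # client is created in the topic's own region.
    sns = boto3.client("sns", region_name=topic_arn.split(":")[3])

    body = format_alert(event)
    sns.publish(
        TopicArn=topic_arn,
        Subject=body.splitlines()[0][:100],  # SNS subjects are capped at 100 characters
        Message=body,
    )
    return {"statusCode": 200, "body": json.dumps({"published": True})}
```

With this layout you would only need one SNS topic (in the monitoring account) instead of one per region, since the alarms themselves don't publish to SNS directly. One thing to keep in mind: EC2 doesn't publish memory utilization by default, so you'd need the CloudWatch agent on the instances for that metric.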
Thanks for the tip! However, they specifically requested that we only use AWS services, no third-party tools allowed.