Hey everyone! I've been assigned a project at work to set up real-time monitoring for all of our EC2 instances. We need to track CPU and memory usage, plus any system status check failures. The challenge is that we have two AWS organizations with multiple child accounts, and over 150 instances scattered across different regions. I've tried setting up CloudWatch alarms, but hit a snag: an alarm can only notify an SNS topic in its own region, which means I'd need to create a ton of SNS topics. I'm hoping someone here knows a simpler way to do this that also supports custom-formatted alerts.
3 Answers
Have you thought about using Prometheus and Grafana? It's free, open-source, and gives you a one-time setup that works across clouds and regions. Definitely worth considering!
You might want to consider creating a centralized monitoring account that uses CloudWatch to gather metrics from all your other accounts. If you need custom alert notifications, you could put a Lambda function in front of them to handle the formatting. Also, keep in mind that there are third-party monitoring tools that could help, like Nagios, Icinga, or Zabbix, if your restrictions allow it.
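To make the Lambda idea a bit more concrete, here is a rough sketch of a handler that takes the SNS event a CloudWatch alarm produces and turns it into a custom-formatted alert string. It assumes the alarm publishes its standard JSON notification to an SNS topic that the function is subscribed to. The function names (format_alarm, lambda_handler) and the output layout are made up for illustration, and actually forwarding the result (email, chat webhook, another topic) is left out on purpose.

```python
import json


def format_alarm(alarm: dict) -> str:
    """Build a short, human-readable alert from a CloudWatch alarm notification.

    The layout below is an illustrative choice, not an AWS convention.
    """
    trigger = alarm.get("Trigger", {})

    # Collect metric dimensions (e.g. InstanceId). Both key casings are
    # accepted here as a precaution.
    dims = {}
    for d in trigger.get("Dimensions", []):
        name = d.get("name") or d.get("Name")
        value = d.get("value") or d.get("Value")
        if name:
            dims[name] = value

    lines = [
        f"[{alarm.get('NewStateValue', 'UNKNOWN')}] {alarm.get('AlarmName', 'unnamed alarm')}",
        f"Account: {alarm.get('AWSAccountId', 'n/a')}",
        f"Region: {alarm.get('Region', 'n/a')}",
        f"Metric: {trigger.get('MetricName', 'n/a')}",
        f"Instance: {dims.get('InstanceId', 'n/a')}",
        f"Reason: {alarm.get('NewStateReason', 'n/a')}",
    ]
    return "\n".join(lines)


def lambda_handler(event, context):
    """Entry point: format every alarm notification delivered in the SNS event."""
    alerts = []
    for record in event.get("Records", []):
        # The SNS message body is the alarm notification as a JSON string.
        alarm = json.loads(record["Sns"]["Message"])
        alerts.append(format_alarm(alarm))
    # Hypothetical next step: send each alert to your channel of choice.
    return alerts
```

Since the account and region are part of the notification itself, one formatter like this can be reused behind every per-region topic, so at least the formatting logic only has to be written once.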
It can be pricey, but have a look at Datadog. It’s a solid option and easy to set up if your budget allows.
Thanks for the suggestion! Unfortunately, they want to stick with AWS only, no third-party tools allowed.