I've been really impressed with the automatic CloudWatch metrics and dashboards that come with AWS services, especially when deploying a Lambda function. It's awesome to immediately track traffic, success rates, latency, and concurrency. However, we're running a multi-tenant platform on AWS, and it would be incredibly beneficial if we could break down these metrics by customer ID for better observability. This would allow us to monitor traffic for specific customers, debug issues, and set up alerts when something goes wrong.
To achieve this, we could emit our own custom CloudWatch metrics using the service endpoint and customer ID as dimensions. But here's the kicker: AWS charges $0.30/month for each custom metric defined by a unique combination of dimensions. When we think about the number of metrics we want to emit (success, errors, latency, etc.) across different endpoints and customers, the costs can skyrocket quickly.
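To put a rough number on that, here's a back-of-the-envelope sketch. The customer, endpoint, and metric counts are made-up assumptions, and it applies the flat $0.30 figure to every metric, ignoring any volume discounts, so treat the result as an upper bound rather than a quote.

```python
# Flat per-metric price quoted above (USD per metric per month).
# Assumption: no volume tiers are applied, so this is an upper bound.
PRICE_PER_METRIC_MONTH = 0.30


def monthly_custom_metric_cost(customers, endpoints, metrics_per_pair,
                               price=PRICE_PER_METRIC_MONTH):
    """Estimate the monthly bill when every (metric, endpoint, customer)
    combination is billed as its own custom metric."""
    unique_metrics = customers * endpoints * metrics_per_pair
    return unique_metrics, unique_metrics * price


# Hypothetical platform: 500 customers, 20 endpoints, 4 metrics per pair.
count, cost = monthly_custom_metric_cost(500, 20, 4)
print(f"{count} unique metrics -> ${cost:,.2f}/month")
```

Even a modest number of customers multiplies out fast, since every extra endpoint or metric name is multiplied by the full customer count.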
Tools like Prometheus seem to handle this kind of workload without breaking the bank. I'm starting to think about using Prometheus alongside separate Grafana dashboards for detailed customer metrics.
So, am I off base for thinking CloudWatch's pricing seems outrageous? How have others approached the challenge of custom metrics in their AWS setups?
5 Answers
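For your specific case, consider using Embedded Metric Format (EMF) together with Contributor Insights. With EMF you can log the customer ID as a plain field in the log event instead of declaring it as a metric dimension, and Contributor Insights can then rank customers from those logs. That way you aren't paying for a separate metric for every value of a high-cardinality dimension.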
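As a minimal sketch of what that could look like from a Lambda function: the record below declares only `Endpoint` as a dimension and carries the customer ID as an ordinary property. The namespace, metric names, and the `build_emf_record` helper are illustrative assumptions, not anything from the original post.

```python
import json
import time


def build_emf_record(namespace, endpoint, customer_id, latency_ms, success):
    """Build one Embedded Metric Format log event (hypothetical helper).

    Only Endpoint is declared as a dimension, so metrics are created per
    endpoint. CustomerId is a plain property: it is searchable in the logs
    but does not create a metric per customer.
    """
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["Endpoint"]],
                    "Metrics": [
                        {"Name": "Latency", "Unit": "Milliseconds"},
                        {"Name": "Success", "Unit": "Count"},
                    ],
                }
            ],
        },
        "Endpoint": endpoint,
        "CustomerId": customer_id,  # property only, not a dimension
        "Latency": latency_ms,
        "Success": 1 if success else 0,
    }


# In Lambda, a JSON line written to stdout lands in CloudWatch Logs.
print(json.dumps(build_emf_record("MyPlatform", "/orders", "cust-123", 42.5, True)))
```

The per-customer breakdown then comes from querying the logs rather than from metric dimensions. Note that the log events themselves are still billed as log ingestion, so this shifts the cost rather than removing it.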
That's a solid point about cost. Remember that the $0.30 is prorated by the hour: you're only charged for the hours in which a metric is actually published. Custom metrics still add up, so it's smart to track only what you need. If there's no action or alarm linked to a metric, consider storing that data in S3 and fetching it when necessary instead. It can save some money in the long run!
I get that it's prorated, but we expect continuous traffic, so we'd be paying for every hour anyway. And waiting for batch updates from S3 could sacrifice real-time insights. What do you think?
Honestly, I faced significant CloudWatch bills, especially when we ramped up traffic. AWS pricing for this stuff should come under review. I've explored self-hosted alternatives for more flexibility, but it's a trade-off. Just be cautious about enabling features like EKS audit logs. They can surprise you with costs!
Yikes, I’ll keep that in mind. Costs can escalate quickly!
You're definitely not alone in this! We also use Prometheus alongside CloudWatch and combine everything in Grafana. It works pretty well for us. Having that flexibility is key, especially as costs can add up with CloudWatch's pricing model.
Are you using a managed Grafana solution or self-hosting it?
Thanks! I'm leaning more towards Prometheus now.
CloudWatch gets pricey fast! Have you considered alternatives like specialized observability tools that might be less costly? I’ve had good conversations with startups in that space too, like last9.io, who are focused on these exact problems.
I checked their site, and I can't seem to find any clear pricing. Just a free tier!
But doesn't EMF still incur costs? And the suggestion to avoid high-cardinality dimensions feels a bit limiting, especially if individual clients run into unique issues.