How Can I Store Billions of Uptime Check Results Without Breaking the Bank?

Asked By TechSavvyFox On

I run an uptime monitoring service and collect check results (HTTP, TCP, UDP, ICMP) every two minutes from various regions. Each check result becomes a new row in BigQuery, which has been fine until now, but the bill is growing quickly along with the table. Once real users start hitting dashboards that analyze months of data, costs will climb fast under the on-demand pricing model. Queries like "last 30 days of this one check" are particularly expensive despite daily partitioning and clustering.

Writes are coming in at thousands of rows per second from a VPS in Frankfurt, while reads need to be instant for dashboard and API purposes, including timelines, uptime percentages, and latency metrics. I'm exploring alternatives like ClickHouse and TimescaleDB, but I'd like feedback from anyone who has transitioned off BigQuery. Did your costs go down, or did you just swap one set of challenges for another? Also, is there a hybrid model that combines aspects of both?

4 Answers

Answered By QueryMaster3000 On

I believe ClickHouse could be a strong alternative. I made a similar move from BigQuery to ClickHouse at my previous job, mainly because of how efficiently it handles percentile queries over arbitrary ranges. The key is to use AggregatingMergeTree for your common time frames and let materialized views keep the rollups up to date automatically. Raw data stays available for drill-downs, while the pre-aggregated rollups keep your frequent dashboard queries cheap.
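To make that concrete, here is a rough sketch of the AggregatingMergeTree + materialized view setup, written with the clickhouse-connect Python client. All table, column, and check names (checks_raw, checks_hourly, check_id = 42, etc.) are made up for the example; adapt the schema to your actual check payloads.

```python
# Sketch: raw MergeTree table + hourly AggregatingMergeTree rollup via a
# materialized view. Names and schema are hypothetical, not from the question.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost")

# Raw check results, partitioned by day and ordered by check id + time.
client.command("""
CREATE TABLE IF NOT EXISTS checks_raw (
    check_id   UInt64,
    ts         DateTime,
    region     LowCardinality(String),
    ok         UInt8,
    latency_ms Float64
) ENGINE = MergeTree
PARTITION BY toDate(ts)
ORDER BY (check_id, ts)
""")

# Hourly rollup table holding aggregate *states* that can be merged over any range.
client.command("""
CREATE TABLE IF NOT EXISTS checks_hourly (
    check_id UInt64,
    hour     DateTime,
    oks      AggregateFunction(sum, UInt64),
    total    AggregateFunction(count),
    latency  AggregateFunction(quantiles(0.5, 0.95, 0.99), Float64)
) ENGINE = AggregatingMergeTree
ORDER BY (check_id, hour)
""")

# Materialized view keeps the rollup current as raw rows are inserted.
client.command("""
CREATE MATERIALIZED VIEW IF NOT EXISTS checks_hourly_mv TO checks_hourly AS
SELECT
    check_id,
    toStartOfHour(ts)                           AS hour,
    sumState(toUInt64(ok))                      AS oks,
    countState()                                AS total,
    quantilesState(0.5, 0.95, 0.99)(latency_ms) AS latency
FROM checks_raw
GROUP BY check_id, hour
""")

# Dashboard-style query: 30-day uptime ratio and latency percentiles for one
# check, answered from the rollup instead of scanning every raw row.
rows = client.query("""
SELECT
    sumMerge(oks) / countMerge(total)          AS uptime_ratio,
    quantilesMerge(0.5, 0.95, 0.99)(latency)   AS latency_quantiles
FROM checks_hourly
WHERE check_id = 42 AND hour >= now() - INTERVAL 30 DAY
GROUP BY check_id
""").result_rows
print(rows)
```

The important property is that the rollup stores mergeable states rather than final numbers, so the same hourly table can answer 7-day, 30-day, or arbitrary-range queries correctly.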

Answered By CuriousCoder101 On

I don't have a specific solution for you, but I'm curious why you aren't considering a traditional database on a dedicated server or VPS. Sometimes that's the simpler and more cost-effective route.

Answered By CloudNomad On

Have you thought about using a time-series database like InfluxDB or TimescaleDB? They're built for high write volumes and could potentially save you money while still allowing quick access for your dashboards. Plus, you can still aggregate older data into BigQuery if you need it for deeper analysis.
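If you go the TimescaleDB route, the rough shape would be a hypertable for raw results plus a continuous aggregate for the dashboards. Here's a minimal sketch using psycopg2; the connection string, table names, and columns are placeholders, and the exact DDL depends on your TimescaleDB version.

```python
# Sketch: TimescaleDB hypertable + hourly continuous aggregate.
# All names and the DSN are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=uptime")
conn.autocommit = True  # continuous aggregates can't be created inside a transaction
cur = conn.cursor()

cur.execute("""
CREATE TABLE IF NOT EXISTS check_results (
    check_id   BIGINT NOT NULL,
    ts         TIMESTAMPTZ NOT NULL,
    region     TEXT,
    ok         BOOLEAN NOT NULL,
    latency_ms DOUBLE PRECISION
);
""")

# Turn the plain table into a time-partitioned hypertable.
cur.execute("SELECT create_hypertable('check_results', 'ts', if_not_exists => TRUE);")

# Hourly continuous aggregate: uptime ratio and average latency per check.
cur.execute("""
CREATE MATERIALIZED VIEW check_results_hourly
WITH (timescaledb.continuous) AS
SELECT
    check_id,
    time_bucket('1 hour', ts) AS bucket,
    avg(CASE WHEN ok THEN 1.0 ELSE 0.0 END) AS uptime_ratio,
    avg(latency_ms) AS avg_latency_ms
FROM check_results
GROUP BY check_id, bucket;
""")

# Dashboards read the rollup; raw rows stay available for incident drill-downs.
cur.execute("""
SELECT bucket, uptime_ratio, avg_latency_ms
FROM check_results_hourly
WHERE check_id = %s AND bucket >= now() - INTERVAL '30 days'
ORDER BY bucket;
""", (42,))
print(cur.fetchall())
```

Timescale's compression and retention policies on the raw hypertable are where most of the cost savings would come from, but those are separate knobs to tune after the basic layout works.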

Answered By DataNinja82 On

You might want to implement a daily rollup job that condenses your results into fewer rows. For example, collapsing the last 24 hours of results into a single summary row per check could save you real money. But I know what you mean about certain queries needing raw data, especially for incident tracking; those can get pricey. Redis won't really help with historical data, but if you're rolling up daily, that's a great start.
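If you want to try that before migrating anywhere, a hypothetical daily rollup could just be a scheduled query run via the BigQuery client library. The project, dataset, table, and column names below are invented for the example.

```python
# Sketch: condense yesterday's raw partition into one row per check per day.
# Project/dataset/table/column names are placeholders, not from the question.
from google.cloud import bigquery

client = bigquery.Client()

rollup_sql = """
INSERT INTO `my_project.uptime.check_results_daily`
  (check_id, day, checks_total, checks_ok, avg_latency_ms, p95_latency_ms)
SELECT
  check_id,
  DATE(ts) AS day,
  COUNT(*) AS checks_total,
  COUNTIF(ok) AS checks_ok,
  AVG(latency_ms) AS avg_latency_ms,
  APPROX_QUANTILES(latency_ms, 100)[OFFSET(95)] AS p95_latency_ms
FROM `my_project.uptime.check_results`
WHERE DATE(ts) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY check_id, day
"""

client.query(rollup_sql).result()  # run the rollup and wait for completion
```

Dashboards can then read the daily table for long ranges and only touch the raw table for recent data or incident drill-downs, which keeps the scanned bytes (and the bill) down.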
