I run an uptime monitoring service that collects check results (HTTP, TCP, UDP, ICMP) every two minutes from several regions. Each result is appended as a new row in BigQuery, which was fine early on, but the bill is growing fast along with the table. Real users loading dashboards that analyze months of data drive costs up quickly under on-demand pricing. Even with daily partitioning and clustering, queries like "last 30 days of this one check" are surprisingly expensive.
Writes arrive at thousands of rows per second from a VPS in Frankfurt, while reads need to be near-instant for dashboards and the API: timelines, uptime percentages, and latency metrics. I'm evaluating alternatives like ClickHouse and TimescaleDB, but I'd like feedback from anyone who has migrated off BigQuery. Did your costs actually go down, or did you just trade one set of problems for another? And is there a sensible hybrid model that combines aspects of both?
4 Answers
ClickHouse could be a strong fit here. I made a similar move off BigQuery at my previous job, mainly because ClickHouse handles percentile queries over arbitrary time ranges so efficiently. The key is an AggregatingMergeTree table for your common time granularities, with materialized views populating the rollups automatically. You keep the raw data for drill-downs while the pre-aggregated tables absorb the frequent (and otherwise costly) dashboard queries.
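To make the rollup idea concrete, here is a minimal Python sketch of what a materialized view would do on insert: group raw check results into hourly buckets, keeping only the aggregate state a dashboard needs (counts for uptime %, and retained latencies for percentiles). The schema and field names (`check_id`, `ts`, `success`, `latency_ms`) are hypothetical, not from the original post.

```python
from collections import defaultdict
from datetime import datetime, timezone

def rollup_hourly(raw_results):
    """Group raw check results into (check_id, hour) buckets.
    Each bucket keeps total/up counts plus the latency samples,
    which is enough to answer uptime % and percentile queries
    without rescanning raw rows. Field names are hypothetical."""
    buckets = defaultdict(lambda: {"total": 0, "up": 0, "latencies": []})
    for r in raw_results:
        hour = r["ts"].replace(minute=0, second=0, microsecond=0)
        b = buckets[(r["check_id"], hour)]
        b["total"] += 1
        b["up"] += r["success"]
        b["latencies"].append(r["latency_ms"])
    return buckets

def percentile(sorted_vals, p):
    """Nearest-rank percentile over a pre-sorted list of latencies."""
    idx = max(0, int(round(p / 100 * len(sorted_vals))) - 1)
    return sorted_vals[idx]
```

In real ClickHouse you would store aggregate-function states (e.g. quantile sketches) in the materialized view rather than raw sample lists, but the shape of the win is the same: dashboards read a few hourly rows instead of thousands of raw ones.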
I don't have a specific solution for you, but I'm curious why you aren't considering a traditional database on a dedicated server or VPS. At this scale, that can sometimes be a simpler and more cost-effective route.
Have you thought about using a time-series database like InfluxDB or TimescaleDB? They're built for high write volumes and could potentially save you money while still allowing quick access for your dashboards. Plus, you can still aggregate older data into BigQuery if you need it for deeper analysis.
You might want to implement a daily cleanup process that condenses results into far fewer rows. At a two-minute interval, the last 24 hours of a single check is about 720 rows, which could collapse into one summary row. But I know what you mean about certain queries needing raw data, especially for incident tracking; those stay pricey. Redis won't really help with historical data, but if you're rolling up daily, that's a great start.
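A minimal sketch of that daily condensation step, assuming hypothetical field names (`success`, `latency_ms`): it collapses one check's raw rows for a day into a single summary row, trading per-sample detail for storage and scan savings.

```python
def condense_day(rows):
    """Collapse one day's raw results for a single check into one
    summary row: total/success counts and min/avg/max latency.
    Per-sample detail is discarded, so keep raw rows around for
    however long incident forensics requires. Fields hypothetical."""
    latencies = [r["latency_ms"] for r in rows]
    return {
        "checks": len(rows),
        "up": sum(r["success"] for r in rows),
        "min_ms": min(latencies),
        "avg_ms": sum(latencies) / len(latencies),
        "max_ms": max(latencies),
    }
```

Note that averages compose across days but percentiles do not, so if your dashboards need p95/p99 over arbitrary ranges, keep a quantile sketch (or the raw data) rather than only these scalar summaries.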
