Hey everyone! I currently have two clusters set up with Rancher, and I'm using ArgoCD in combination with GitLab. I've deployed Prometheus and Grafana through the kube-prometheus-stack, and it's all working great for my first cluster. I'm looking to centralize the monitoring across both clusters. Can anyone share how to add the second cluster? I'd love an easy tutorial to manage metrics and dashboards for any new clusters I might set up in the future. Also, are there any prebuilt stacks I can leverage for my monitoring needs? Just a note that everything is on-premise.
2 Answers
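You can use Thanos as a global query layer on top of your existing Prometheus and Grafana setup. It consolidates metrics from multiple clusters into a single view. Essentially, you install kube-prometheus-stack in each cluster with the Thanos Sidecar enabled. The sidecar exposes that cluster's Prometheus data over the Thanos Store API (and can optionally upload blocks to object storage), and a Thanos Query instance in a central cluster fans out to all the sidecars. Grafana then pulls its data from Thanos Query. Make sure every Prometheus sets an external label with the cluster name, like cluster=prod, so series from different clusters stay distinguishable and are easy to filter in Grafana.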
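To make the labeling advice concrete, here is a minimal sketch of a kube-prometheus-stack values fragment for one cluster. It only sets the external label; enabling the sidecar itself and any object storage settings depend on your chart version, so they are left out here. The value `prod` is just the example from above, and each cluster needs its own unique name.

```yaml
# values.yaml fragment for kube-prometheus-stack in ONE workload cluster (sketch).
prometheus:
  prometheusSpec:
    # External labels are attached to every series that leaves this
    # Prometheus (Thanos Sidecar, remote write, federation).
    externalLabels:
      cluster: prod   # example value; use a different name in each cluster
```

Since you already deploy with ArgoCD, one possible layout is a shared base values file plus a small per-cluster override like this, so adding a new cluster mostly means adding one override with a new `cluster` value.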
Another option is to install kube-prometheus-stack in every cluster and put Thanos in front of them. You run Grafana only in the central cluster and point it at Thanos as its data source. With that in place, you can build dynamic dashboards with a variable that filters by cluster or environment, using the labels you set up.
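As a rough illustration of the central Grafana side, the fragment below adds Thanos as a Prometheus-type data source through the chart's Grafana values. The service name, namespace, and port in the URL are assumptions for the example, not something your install is guaranteed to have, so check what your Thanos Query service is actually called.

```yaml
# values.yaml fragment for kube-prometheus-stack in the CENTRAL cluster (sketch).
grafana:
  additionalDataSources:
    - name: Thanos
      # Thanos Query speaks the Prometheus HTTP API, so the regular
      # Prometheus data source type is used.
      type: prometheus
      access: proxy
      # Hypothetical in-cluster address of Thanos Query; adjust to your setup.
      url: http://thanos-query.monitoring.svc:9090
      isDefault: false
```

In a dashboard, a variable defined with the query `label_values(up, cluster)` then lists every cluster that reports in, and panels can filter with `{cluster="$cluster"}`.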

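Just a heads up: with Thanos Sidecar, the central Thanos Query has to reach each sidecar's gRPC endpoint, which can get tricky if you don't have direct connections between clusters. If that's your situation, Thanos Receive might simplify things: each Prometheus pushes its metrics to the central cluster via remote write, so the workload clusters only need outbound access to one endpoint.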
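For the push-based variant, a minimal sketch of the per-cluster values could look like the following. The hostname is hypothetical; `/api/v1/receive` is the remote write path that Thanos Receive serves (on port 19291 by default, unless you expose it through an ingress as assumed here).

```yaml
# values.yaml fragment for kube-prometheus-stack in a workload cluster (sketch).
prometheus:
  prometheusSpec:
    externalLabels:
      cluster: prod   # example value; unique per cluster
    remoteWrite:
      # Hypothetical ingress hostname in front of Thanos Receive
      # in the central cluster.
      - url: https://thanos-receive.central.example.internal/api/v1/receive
```

With this approach the central cluster never has to open connections into the other clusters, which tends to fit on-premise networks with strict firewalling between environments.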