Is Cilium the Right Choice for Our Networking Needs?

May 11, 2025

Asked By DevOpsNinja123 On May 11, 2025

Hey everyone,

I'm part of a startup focused on networking, and right now it's just me and one other DevOps engineer managing our tech stack. We currently have Grafana set up on Kubernetes for monitoring purposes, but I'm keen to start digging into network metrics since I've noticed some of our pods could scale better based on open connections instead of just CPU usage.

I've been exploring KEDA and Knative for scaling based on these metrics, but I'm curious whether Cilium could provide even more benefits. I'm looking for better network observability and alerting (for example, catching when NGINX returns 500 errors) and for a way to scale on those metrics.

My questions are:
1. Is Cilium the right tool for what I want? Or could I just keep things simpler with KEDA or Knative? I'm mainly interested in monitoring network performance, setting alerts, etc.
2. If Cilium is suitable, can we implement it gradually, or do we need to fully commit from the start? Both of us are fairly new to working with a network mesh and advanced CNI features, and my colleague doesn't have experience here as he's more focused on AWS cloud stuff.
3. Can Cilium integrate with Grafana? We're currently using the LGTM stack with k8s-monitoring, utilizing Grafana Alloy.

Thanks for any advice!

2 Answers

Answered By SimplifyFirst On May 13, 2025

Honestly, I would steer clear of adding Cilium complexity for now, especially with just two of you on the team. Using open connections as a scaling metric is generally not reliable because they tend to be really dynamic. In my experience, many have tried using metrics like active HTTP connections for scaling, and it usually backfires because connections are often too short-lived. If your processes are long-running, consider using a queue for better results instead.
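To make the "too dynamic" point concrete, here's a rough sketch of the core ratio the Kubernetes HPA uses (desired = ceil(currentReplicas * currentMetric / targetMetric)), fed with a bursty per-pod connection count. The sample values and the target of 100 connections per pod are made up for illustration, and the real controller also applies a tolerance and a scale-down stabilization window that this ignores.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Core HPA ratio: ceil(currentReplicas * currentMetric / targetMetric), floored at 1."""
    return max(1, math.ceil(current_replicas * current_value / target_value))


# Hypothetical per-pod average of open connections at successive samples.
samples = [40, 180, 35, 210, 50]
target = 100  # assumed target: 100 open connections per pod

replicas = 4
history = []
for value in samples:
    # Each sample immediately drives a new replica count (no smoothing applied).
    replicas = desired_replicas(replicas, value, target)
    history.append(replicas)

print(history)  # [2, 4, 2, 5, 3]
```

The replica count bounces up and down with every sample, which is the kind of flapping you get when the metric moves faster than pods can start.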

ScalingExpert99 - May 14, 2025

Totally agree! CPU has always been my go-to for scaling decisions.

Answered By NetworkGuru81 On May 12, 2025

You're a bit off track here. KEDA is an event-driven autoscaler, and it doesn't collect anything itself: it only knows what's going on if it can pull metrics from an external source such as Prometheus. It lets you autoscale on metrics other than CPU and memory, which is great, but you still need something else to produce those metrics.

For some of your needs, you could use NGINX to export metrics directly to your Grafana setup. Cilium can also provide those metrics, especially if you use it in 'chaining mode' over the AWS VPC CNI. Also, consider using tools like Retina for additional insights.
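As a rough idea of what the NGINX side exposes, here's a minimal sketch that parses the plain-text output of the stub_status module into numbers you could ship to your metrics pipeline. It assumes stub_status is enabled on your NGINX; the sample text and the function name are just for illustration. Note it only covers connection counts, not status codes, so 500s would have to come from access logs or an exporter.

```python
import re

# Illustrative stub_status response body (the values are placeholders).
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text: str) -> dict[str, int]:
    """Extract connection and request counters from stub_status output."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The counters line holds exactly three integers: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and counters and states):
        raise ValueError("unexpected stub_status format")
    accepts, handled, requests = map(int, counters.groups())
    reading, writing, waiting = map(int, states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


metrics = parse_stub_status(SAMPLE)
print(metrics["active"], metrics["waiting"])  # 291 106
```

In practice you'd more likely run an existing exporter than write this yourself, but it shows how little is needed to get an open-connections number out of NGINX.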
