I'm looking to implement an AI Agent that can help troubleshoot various problems within my Kubernetes environment, like pods that keep restarting or missing kubelet metrics. The AI Agent won't be running inside the Kubernetes cluster, and the Kube API server is not accessible from outside. Additionally, it will only have read-only access to the data. I also can't use kubectl from outside the cluster, but the AI Agent can connect to applications within the cluster that are exposed through ingress. It should gather Kubernetes data only when I specifically request it. What would be the best architectural setup for this situation?
5 Answers
Designing your AI Agent to fetch data only on request sounds like a good idea. It could then analyze the logs as they come in and make suggestions based on the data it gathered. Just something to think about!
You could run a pod inside the cluster with a read-only service account that acts as a middleman, exposed through your ingress. It would query the Kubernetes API on your AI's behalf, and it could even call a language model API itself to help process the data effectively.
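To make the middleman idea a bit more concrete, here is a rough sketch of the request filter such a pod might apply before forwarding anything to the Kube API. The prefix allowlist, the blocked segments, and the `is_allowed` name are all assumptions for illustration, not part of any existing tool, and the service account's RBAC rules should remain the real enforcement layer.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: API path prefixes the gateway is willing to forward.
ALLOWED_PREFIXES = (
    "/api/v1/pods",
    "/api/v1/nodes",
    "/api/v1/events",
    "/api/v1/namespaces",
    "/apis/apps/v1",
)

# Path segments that are never forwarded, even for read-only methods
# (sensitive data or interactive subresources).
BLOCKED_SEGMENTS = {"secrets", "exec", "attach", "portforward", "proxy"}

READ_ONLY_METHODS = {"GET", "HEAD"}


def is_allowed(method: str, url: str) -> bool:
    """Return True if the gateway should forward this request to the Kube API."""
    if method.upper() not in READ_ONLY_METHODS:
        return False
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    # Reject path traversal and anything touching a blocked resource.
    if ".." in segments or BLOCKED_SEGMENTS.intersection(segments):
        return False
    # Match whole path components so "/api/v1/podsx" does not slip through.
    return any(path == p or path.startswith(p + "/") for p in ALLOWED_PREFIXES)
```

With a filter like this, `is_allowed("GET", "/api/v1/namespaces/default/pods/web-0/log")` passes while any `POST`, `PATCH`, or `DELETE` is rejected before it reaches the cluster. The gateway only runs when the agent calls it, which fits the "only when I request it" requirement.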
Have you considered using some observability tools? Tools like Grafana and Victoria Metrics can provide useful insights into your cluster's state. If you have a GitOps setup (like ArgoCD), that could also help integrate changes smoothly. Just trying to gauge what existing context or tools you might already be using for your AI Agent.
There's a tool called kubectl mcp server that I found useful. You can set it to read-only mode. I haven't specifically used it for debugging yet, but it might be worth checking out for your use case.
Just a thought – sometimes diving into new tech can lead to more issues instead of solving them. Make sure the AI Agent is genuinely adding value to the debugging process, so you don’t end up with double the problems later on!

We usually deploy with helmfile, and we've got metrics and logs through Victoria Metrics and Grafana, so we do have some observability in place.
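
Given that setup, one option is to have the agent ask Victoria Metrics directly through its ingress instead of touching the Kube API at all. Below is a minimal sketch, assuming a single-node Victoria Metrics that exposes the Prometheus-compatible `/api/v1/query` endpoint (the cluster version uses a different path prefix) and that kube-state-metrics is being scraped so `kube_pod_container_status_restarts_total` exists. The base URL and function names are made up for illustration.

```python
from urllib.parse import urlencode


def restarting_pods_query(namespace: str, window: str = "1h") -> str:
    """Build a PromQL expression for containers that restarted within the window."""
    return (
        "increase(kube_pod_container_status_restarts_total"
        f'{{namespace="{namespace}"}}[{window}]) > 0'
    )


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus-style instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def restarts_by_pod(response: dict) -> dict:
    """Sum restart increases per pod from a Prometheus-style vector response."""
    if response.get("status") != "success":
        return {}
    totals = {}
    for series in response["data"]["result"]:
        pod = series["metric"].get("pod", "<unknown>")
        _timestamp, value = series["value"]  # value arrives as a string
        totals[pod] = totals.get(pod, 0.0) + float(value)
    return totals
```

The agent would fetch the URL from `build_instant_query_url("https://vm.example.com", restarting_pods_query("default"))` with whatever HTTP client it already uses, then pass the decoded JSON to `restarts_by_pod`. A similar query on the kubelet scrape targets could cover the missing kubelet metrics case.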