What’s the Best Tool to Gather Logs and Resource States from Ephemeral Kubernetes Clusters?

Asked By CloudyPineapple93 On

I'm looking for a robust solution to collect logs from all my Kubernetes pods, including the previous runs and the states of API resources and events. This is especially important when my runs fail in an ephemeral cluster, like those used in CI pipelines. While I can create a wrapper around several kubectl commands using bash or Python, I'm curious if there's a dedicated tool that can simplify this process and capture 'everything' I might need.

5 Answers

Answered By CloudBuilder99 On

When deploying, you’ll need to actively track resources like pods, events, and custom resource statuses on your own to get the full picture. Alternatively, consider using a GitOps approach to separate these concerns; tools like ArgoCD can help with managing your deployments this way.
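One way to cover custom resources without listing them by hand is to ask the API server which resource types support `list` and then dump each one. A minimal sketch, assuming `kubectl` is on the PATH and already pointed at the cluster; the function names are made up for illustration:

```python
import subprocess


def parse_resource_names(api_resources_output: str) -> list[str]:
    """Turn the output of `kubectl api-resources -o name` into a sorted, de-duplicated list."""
    names = {line.strip() for line in api_resources_output.splitlines() if line.strip()}
    return sorted(names)


def build_get_commands(resource_names: list[str]) -> list[list[str]]:
    """Build one `kubectl get` command per resource type, across all namespaces."""
    return [
        ["kubectl", "get", name, "--all-namespaces", "-o", "yaml"]
        for name in resource_names
    ]


def list_resource_types() -> list[str]:
    """Ask the cluster which resource types can be listed (needs a reachable cluster)."""
    result = subprocess.run(
        ["kubectl", "api-resources", "--verbs=list", "-o", "name"],
        capture_output=True, text=True, check=True,
    )
    return parse_resource_names(result.stdout)
```

Running each command from `build_get_commands(list_resource_types())` and saving the output per resource type should pick up CRD-backed objects (certificates, volumes, and so on) along with the built-in ones.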

EphhemeralExpert -

ArgoCD gets deployed pretty far down the line in my setup, and things can break before it's even running. For example, I've hit the Let's Encrypt limit on the number of new certificates, which caused Teleport to fail. It’s frustrating when you can't capture those errors in CI runs.

Answered By TechieTurtle22 On

You might want to try using kube-prometheus-stack with Loki and Promtail (which Grafana has since superseded with Alloy). It's pretty comprehensive for monitoring and logging, but keep in mind that if the deployment itself fails, the logging stack may never come up, and since the environment is ephemeral you might not have any logs from those failed states.

SwiftSparrow18 -

What happens if the deploy itself fails and the ephemeral CI environment is already gone by the time you go looking at the failed job? I need something more reliable that doesn't depend on an in-cluster stack, like kubectl.

Answered By DevMaster3000 On

If you're looking to capture all cluster state information, logs, and even logs from previous pods, consider tools like kubectl-trace or kubectl-debug. However, when things go bad, I often find myself piecing together kubectl commands anyway. Also, I've heard CubeAPM is getting popular for observability, but I haven’t tested it for capturing state in a cluster yet. I'm curious whether others have managed that.

CuriousCoder45 -

I just want to save the state of the cluster before it gets destroyed. Logging and observability should ideally happen after my code runs. If a foundational component like Longhorn fails during deploy, I need a way to keep that artifact in my CI for the failed job.
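For the save-the-state-before-teardown case, here is a sketch of a collector that a CI job could call in an always-run step and then upload the output directory as an artifact. It assumes `kubectl` is on the PATH; `DUMP_COMMANDS`, `collect`, and the file names are illustrative choices, not a standard tool:

```python
import subprocess
from pathlib import Path

# Illustrative set of commands; adjust to what your cluster actually runs.
DUMP_COMMANDS = {
    "events.txt": ["kubectl", "get", "events", "--all-namespaces",
                   "--sort-by=.metadata.creationTimestamp"],
    "pods.yaml": ["kubectl", "get", "pods", "--all-namespaces", "-o", "yaml"],
    "nodes.txt": ["kubectl", "describe", "nodes"],
}


def collect(out_dir, commands=DUMP_COMMANDS, runner=subprocess.run):
    """Run each command and write its output to out_dir; never raise on a failed command."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures = []
    for filename, cmd in commands.items():
        try:
            result = runner(cmd, capture_output=True, text=True, timeout=120)
            text = result.stdout + result.stderr
            if result.returncode != 0:
                failures.append(filename)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Keep going: a half-broken cluster is exactly when the rest matters.
            text = f"failed to run {cmd}: {exc}\n"
            failures.append(filename)
        (out / filename).write_text(text)
    return failures
```

The collector deliberately swallows individual failures so that one unreachable resource doesn't cost you the rest of the dump. The built-in `kubectl cluster-info dump --all-namespaces --output-directory=<dir>` is also worth trying alongside it, since it writes cluster objects and pod logs to a directory in one go.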

Answered By LoggerLynx On

Stern is a decent option for tailing logs, but I'm not sure whether it can pull --previous logs, and I might be mistaken. If you know how to do that, please share!

InquisitiveBeetle -

I think it lacks that ability, but if I'm wrong, I'd love to learn how to retrieve previous logs!
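For what it's worth, plain `kubectl logs` does have a `--previous` flag, which returns the log of the prior instance of a container if one exists. A sketch of picking out the containers worth asking about, assuming the input is the parsed JSON from `kubectl get pods --all-namespaces -o json`; `previous_log_commands` is a hypothetical helper name:

```python
def previous_log_commands(pod_list: dict) -> list[list[str]]:
    """Build `kubectl logs --previous` commands for containers that have restarted."""
    commands = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        status = pod.get("status", {})
        # Init containers can crash-loop too, so check both status lists.
        statuses = (status.get("initContainerStatuses", [])
                    + status.get("containerStatuses", []))
        for cs in statuses:
            if cs.get("restartCount", 0) > 0:
                commands.append([
                    "kubectl", "logs", meta["name"],
                    "--namespace", meta["namespace"],
                    "--container", cs["name"],
                    "--previous",
                ])
    return commands
```

Filtering on `restartCount` avoids a pile of errors from containers that never had a previous instance.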

Answered By AlloyTraveler On

Another suggestion is to set up Alloy and send all logs and metrics to a remote Loki server. This way, even if the cluster shuts down, you'll still have historical logs to refer back to, including logs from past pods.
