I've been assigned the responsibility of conducting technical interviews focused on Kubernetes for engineering positions ranging from IC2 to IC4. Currently, we lack a structured process or standard operating procedures for these interviews, so I'm seeking input on which approaches have worked well or poorly in your experience.
I'm particularly interested in the idea of setting up live troubleshooting labs to assess candidates' abilities to diagnose and resolve issues in real time. I find troubleshooting to be a fascinating skill, and I want to ensure I'm selecting candidates who can effectively navigate problems. One idea I've explored is using a tool like Killercoda for this purpose, although I need to verify if it's appropriate for business use.
For example, I might present a scenario where a Deployment has been applied successfully, but no pods are starting. The underlying issue could be something like a missing secret. A more complex scenario could involve a pod that restarts every 90 seconds because throttling causes its liveness probe to fail. I'm aware that many in the community are not in favor of coding challenges, but I'm curious about your thoughts on live troubleshooting exercises instead.
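To make the first scenario concrete, here is a rough sketch of how I might generate the broken manifest for the lab. It is only an illustration: the names `lab-app`, `app-credentials`, and `DB_PASSWORD` and the `nginx:stable` image are placeholders I made up, and the only point is that the container references a Secret that is never created.

```python
import json

# All names below (lab-app, app-credentials, DB_PASSWORD) are hypothetical placeholders.
MISSING_SECRET = "app-credentials"  # deliberately never created in the lab cluster


def broken_deployment(name="lab-app", image="nginx:stable"):
    """Return a Deployment manifest whose container reads an env var from a Secret that does not exist."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "env": [{
            "name": "DB_PASSWORD",
            "valueFrom": {"secretKeyRef": {"name": MISSING_SECRET, "key": "password"}},
        }],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(broken_deployment(), indent=2))
```

With something like this applied, I would expect the Deployment to be accepted and a pod object to be created, with the container stuck in `CreateContainerConfigError` until the candidate notices the missing Secret in the pod's events.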
5 Answers
For my recent K8s interviews, I had candidates fix a very basic crash-looping deployment where the logs pointed to a simple service port error. I also included a scenario where they needed to troubleshoot a liveness probe that was failing because it used the wrong endpoint/port. They seemed to appreciate the challenge! If you want, I can share the take-home portion I used. I was skeptical about it initially, but I ended up giving in to management's request.
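If it helps, this is roughly how one could check whether a candidate's fix for the probe scenario is consistent. It is a minimal sketch, not what I actually ran: the function name `probe_port_mismatches` is made up, and it assumes the container spec is available as a plain dict (for example parsed from `kubectl get -o json`).

```python
def probe_port_mismatches(container):
    """Return (probe_name, port) pairs for HTTP probes whose port the container does not declare."""
    ports = container.get("ports", [])
    declared_numbers = {p["containerPort"] for p in ports}
    declared_names = {p["name"] for p in ports if "name" in p}

    problems = []
    for probe_name in ("livenessProbe", "readinessProbe", "startupProbe"):
        http_get = container.get(probe_name, {}).get("httpGet")
        if http_get is None:
            continue  # no HTTP probe of this kind configured
        port = http_get["port"]
        # A probe port may be a number or the name of a declared port.
        known = declared_names if isinstance(port, str) else declared_numbers
        if port not in known:
            problems.append((probe_name, port))
    return problems


# Hypothetical broken container: the app listens on 8080, the probe hits 80.
broken = {
    "name": "web",
    "ports": [{"containerPort": 8080}],
    "livenessProbe": {"httpGet": {"path": "/healthz", "port": 80}},
}
print(probe_port_mismatches(broken))  # [('livenessProbe', 80)]
```

One caveat: declaring `containerPort` is informational in Kubernetes, so a probe can legitimately target an undeclared port. Treat a mismatch as a hint worth discussing, not proof of a bug.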
For senior roles, I think it’s enough to engage them in conversation about their previous work and projects. We want to see their thought processes and get a feel for their soft skills rather than focus solely on technical problems. For junior roles, though, testing their troubleshooting skills is essential since they lack industry experience. It’s all about finding the right balance based on the role.
You can create practical scenarios that candidates might actually face, such as pods that won't start because of resource constraints, or crashes caused by a deleted secret. This gives you insight into their problem-solving abilities. Alternatively, something like SadServers offers ready-made troubleshooting scenarios for practice.
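For the resource-constraint case, one simple way to set it up is a pod that asks for more CPU than any node can offer. A rough sketch, assuming you already know the largest node's allocatable CPU; the helper names, `greedy-pod`, and the `nginx:stable` image are placeholders of mine:

```python
from decimal import Decimal


def cpu_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as "500m", "2", or "0.5" into millicores."""
    q = str(quantity)
    if q.endswith("m"):
        return int(q[:-1])
    return int(Decimal(q) * 1000)


def unschedulable_pod(node_allocatable_cpu, name="greedy-pod"):
    """Build a Pod manifest requesting one core more than the given allocatable CPU."""
    request = cpu_millicores(node_allocatable_cpu) + 1000
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": "nginx:stable",
                "resources": {"requests": {"cpu": f"{request}m"}},
            }],
        },
    }
```

On a cluster without node autoscaling, I would expect a pod like this to stay `Pending`, with a `FailedScheduling` event mentioning insufficient CPU, which is what the candidate should find with `kubectl describe pod`.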
Great tip, SadServers could be really useful for this!
I’m not a fan of take-home or live tests since they often feel like trick questions. Instead, I prefer a conversational style: I set up a scenario in which the candidate has AWS, GitHub, and a Dockerfile, and we discuss the right practices for each. I’ve found that the best candidates can articulate how they'd build infrastructure based on their past experience, which reveals their problem-solving mindset. I throw in an impossible scenario at the end just for fun, like a rogue employee deleting pods, all in good humor!
That’s an interesting approach! It shifts the focus from stress to understanding.
In my last interview, I used a real incident we had dealt with and asked candidates how they would have approached solving it. I feel like that reveals more about their thought process than checking whether they've memorized commands. It's crucial to see how they think under pressure without being harsh about it, so I avoided reinstalling any systems during our chat.
How did they handle it? Sounds like a neat approach!

Could you send me that take-home test? I’m looking to practice my K8s skills!