For those who have experience writing Kubernetes operators, I'd love to hear about the struggles you've encountered. What specific parts of the process almost made you give up or threw you for a loop? Any funny or frustrating anecdotes are welcome!
2 Answers
While I didn’t exactly want to give up, writing solid tests for async control loops was a real struggle. What worked for me was going all out with integration tests: spinning up a kind cluster, installing the operator into it, and watching how everything played out in real time. It’s definitely a lot of work, but it gets the job done!
I’d say reconcile loop idempotency is a huge pain. It’s easy to make it work the first time, but then you realize it breaks on subsequent reconciles because of the 'resource already exists but drifted' case. Plus, writes to the status subresource can trigger endless reconciles, which is a classic headache. Kubebuilder hides some of this until you're in production, so it really hits hard when you find out the hard way. I've seen KRO or Crossplane handle simpler cases without needing to dive into Go, which can be a lifesaver.
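To make the 'already exists but drifted' case concrete, here is a rough sketch of an idempotent reconcile against an in-memory stand-in for the API server. Deployment, fakeCluster, and reconcile are hypothetical names for illustration, not controller-runtime types. The point is only the three-way branch: create if missing, update if drifted, and do nothing otherwise.

```go
package main

import "fmt"

// Deployment is a hypothetical stand-in for a child resource the operator owns.
type Deployment struct {
	Image    string
	Replicas int
}

// fakeCluster is an in-memory stand-in for the API server (an assumption
// made so the example runs without a cluster).
type fakeCluster struct {
	objects map[string]Deployment
	writes  int // counts writes, to show that repeat reconciles are no-ops
}

// reconcile converges the named object to desired and reports what it did.
// Calling it again with the same desired state performs no write.
func reconcile(c *fakeCluster, name string, desired Deployment) string {
	current, exists := c.objects[name]
	switch {
	case !exists:
		c.objects[name] = desired
		c.writes++
		return "created"
	case current != desired:
		// The object exists but has drifted from the desired state.
		c.objects[name] = desired
		c.writes++
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	c := &fakeCluster{objects: map[string]Deployment{}}
	want := Deployment{Image: "app:v2", Replicas: 3}

	fmt.Println(reconcile(c, "web", want)) // created
	fmt.Println(reconcile(c, "web", want)) // unchanged

	// Simulate someone scaling the resource by hand.
	c.objects["web"] = Deployment{Image: "app:v2", Replicas: 1}
	fmt.Println(reconcile(c, "web", want)) // updated
	fmt.Println("writes:", c.writes)       // writes: 2
}
```

In a real operator, controller-runtime's controllerutil.CreateOrUpdate covers the same create-or-patch shape, though you still have to decide which fields you own when comparing.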
Yeah, I totally get that! That’s why predicates are a must. They help you avoid reconciling just because the status was updated.
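For anyone who hasn't used them, the usual trick is to filter update events on metadata.generation, which the API server bumps on spec changes but not on status writes (assuming the CRD has the status subresource enabled). Below is a standalone sketch of that check. ObjectMeta, UpdateEvent, and generationChanged are simplified stand-ins written for this example, not the real controller-runtime types.

```go
package main

import "fmt"

// ObjectMeta is a simplified stand-in that carries only the field the filter needs.
type ObjectMeta struct {
	Generation int64
}

// UpdateEvent is a simplified stand-in for the old and new versions of an object.
type UpdateEvent struct {
	Old, New ObjectMeta
}

// generationChanged reports whether the update should trigger a reconcile.
// Status-only writes leave the generation untouched, so they are filtered out.
func generationChanged(e UpdateEvent) bool {
	return e.Old.Generation != e.New.Generation
}

func main() {
	statusOnly := UpdateEvent{Old: ObjectMeta{Generation: 4}, New: ObjectMeta{Generation: 4}}
	specEdit := UpdateEvent{Old: ObjectMeta{Generation: 4}, New: ObjectMeta{Generation: 5}}

	fmt.Println(generationChanged(statusOnly)) // false
	fmt.Println(generationChanged(specEdit))   // true
}
```

controller-runtime ships this as predicate.GenerationChangedPredicate. One thing to watch out for: it also filters metadata-only changes such as labels and annotations, so it is the wrong filter if your controller reacts to those.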
Exactly! Nobody seems to nail reconcile loop idempotency the first time. It’s a learning process for sure. Have you found any workarounds while building operators from scratch?

Sounds like a solid plan! The testing phase can be the most taxing part if you want to ensure everything’s functioning properly.