I just spent four hours untangling a huge mess that started with what I thought was a harmless commit touching Lambda permissions. I figured sharing my experience could save someone else a Thursday nightmare.
Here's what went down: a Crossplane resource was committed using `lambda.aws.upbound.io/v1beta1`, while our cluster was set up to work with `v1beta2`. The conversion webhook failed because the structure of the `loggingConfig` field changed between versions: what was once a map became an array. That failure locked us out of every Lambda function resource, so even simple commands failed, and it left ArgoCD stuck in a permanent unknown state.
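A version mismatch like this can in principle be caught before it reaches the cluster. The following is only a rough sketch of such a check, not something from the original incident: it assumes the manifests are already parsed into dictionaries (for example by a YAML loader), and `EXPECTED_VERSIONS` and `find_stale_manifests` are hypothetical names made up for illustration.

```python
# Hypothetical pre-commit check: flag manifests whose apiVersion does not match
# the version the cluster is expected to serve for that API group.
# The group/version pair below is taken from the post; adjust for your cluster.
EXPECTED_VERSIONS = {"lambda.aws.upbound.io": "v1beta2"}


def find_stale_manifests(manifests, expected=EXPECTED_VERSIONS):
    """Return (kind, name, apiVersion) for each manifest on an unexpected version."""
    stale = []
    for manifest in manifests:
        api_version = manifest.get("apiVersion", "")
        # "group/version" splits into group and version; core resources such
        # as "v1" have an empty group and are skipped by the lookup below.
        group, _, version = api_version.rpartition("/")
        if group in expected and version != expected[group]:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            stale.append((manifest.get("kind", "<unknown>"), name, api_version))
    return stale
```

Running something like this in CI against the rendered manifests would flag the `v1beta1` resource before ArgoCD ever tries to apply it, though it does nothing for objects already stored in the cluster.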
I tried all the usual troubleshooting steps, such as disabling validating webhooks and restarting the provider pods, but nothing worked. In the end, the only thing that worked was drastic: I deleted the entire Custom Resource Definition (CRD). That wiped all Lambda functions, but it allowed Crossplane to recreate the CRD. After that, I updated my manifests to the new version and reformatted the `loggingConfig` field. The lesson for me: when you hit a deadlock with webhooks, you might have to take extreme measures to get back to normal operations.
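The manifest update described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: it takes the post's description at face value (a single `loggingConfig` map in `v1beta1` becomes a list in `v1beta2`), it assumes the field lives under `spec.forProvider`, and `migrate_manifest` is a hypothetical helper name. Check the CRD schema for your provider version before relying on the direction of the change.

```python
import copy

OLD_API_VERSION = "lambda.aws.upbound.io/v1beta1"
NEW_API_VERSION = "lambda.aws.upbound.io/v1beta2"


def migrate_manifest(manifest):
    """Return a copy bumped to the new apiVersion with loggingConfig as a list.

    Assumption (from the post): loggingConfig was a map and is now an array,
    and it sits under spec.forProvider. Other manifests are returned unchanged.
    """
    if manifest.get("apiVersion") != OLD_API_VERSION:
        return manifest
    migrated = copy.deepcopy(manifest)  # never mutate the caller's data
    migrated["apiVersion"] = NEW_API_VERSION
    for_provider = migrated.get("spec", {}).get("forProvider", {})
    logging_config = for_provider.get("loggingConfig")
    if isinstance(logging_config, dict):
        # Wrap the single map in a one-element list.
        for_provider["loggingConfig"] = [logging_config]
    return migrated
```

For example, a manifest whose `spec.forProvider.loggingConfig` is `{"logFormat": "JSON"}` (a placeholder value) would come back with `[{"logFormat": "JSON"}]` and the `v1beta2` apiVersion, while the input dictionary stays untouched.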
Has anyone else experienced a similar deadlock with Crossplane or any other tool? What was your solution?
2 Answers
Honestly, I haven't heard anyone singing Crossplane's praises. It seems pretty hit or miss, depending on your use case. What was your experience with it?
I’m cautiously optimistic. I’m using it minimally, like just managing an S3 bucket in EKS. So far, it's been smooth sailing for our multi-namespace setup.
I still think Crossplane is a bad idea overall. It just complicates things unnecessarily. Anyone care to change my mind?
What makes you feel that way? I haven't used it yet, but the concept seems good to me.
Our R&D team loves it! They say switching to Crossplane was one of the best decisions we've made for DevOps. It just seems to work for what we're doing.