I've been banging my head against the wall over a frustrating issue between ArgoCD and Crossplane. ArgoCD shows resources as "Healthy" and "Synced," but Crossplane is actually failing to provision the AWS resources, with 400 errors coming back from the API. While ArgoCD gives a thumbs up, my Lambda functions aren't updating, RDS instances are stuck, and IAM roles are missing. I've searched high and low for information on this but found nothing. It feels like I'm the only one dealing with this silent failure, where the health checks seem fundamentally broken. Is anyone else using health checks with Crossplane? Or are we all just taking ArgoCD's status at face value? I managed to fix it by changing the order of the condition checks, but I'm baffled that this isn't a common topic. Am I just super unlucky, or does this point to a flaw in how many people are using these tools?
4 Answers
You might be misunderstanding how GitOps works. ArgoCD is just ensuring that the resources in your cluster match what’s declared - it’s not supposed to be a complete health monitor. If Crossplane is having issues, that’s actually on Crossplane to fix, not ArgoCD. You should set up proper monitoring tools to catch those errors instead!
Couldn’t agree more. Building out proper observability is key!
I had a similar dilemma a while back. Thankfully, I was aware of how ArgoCD health checks work before I tackled Crossplane. I bet many don’t realize they might need to implement custom health checks since the default ones can be misleading. People really should test and customize their setups!
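To make the "order of condition checks" point concrete, here is a minimal sketch of the idea. It is a Python model of the logic only: real ArgoCD custom health checks are Lua scripts configured in ArgoCD, and the function name `crossplane_health` is hypothetical. It assumes the managed resource exposes the usual Crossplane `Ready` and `Synced` conditions, and that a stale `Ready=True` can sit next to a `Synced=False` after a failed update.

```python
from typing import Any


def crossplane_health(resource: dict[str, Any]) -> tuple[str, str]:
    """Return (status, message) in ArgoCD's health vocabulary.

    Hypothetical helper: models the ordering only, not ArgoCD's real Lua API.
    """
    conditions = resource.get("status", {}).get("conditions", [])
    by_type = {c.get("type"): c for c in conditions}

    # Check for failure first: a failed reconcile must win over a stale Ready=True.
    synced = by_type.get("Synced")
    if synced is not None and synced.get("status") == "False":
        return "Degraded", synced.get("message", "Synced is False")

    # Only report Healthy once no failure condition is present.
    ready = by_type.get("Ready")
    if ready is not None and ready.get("status") == "True":
        return "Healthy", "Resource is ready"

    # No conditions yet, or Ready is not True: still provisioning.
    return "Progressing", "Waiting for Ready condition"


if __name__ == "__main__":
    stale = {
        "status": {
            "conditions": [
                {"type": "Ready", "status": "True"},
                {"type": "Synced", "status": "False", "message": "400 Bad Request"},
            ]
        }
    }
    print(crossplane_health(stale))  # ('Degraded', '400 Bad Request')
```

A check that looks at `Ready` first would return "Healthy" for the `stale` example, which is the silent failure described in the question.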
Yeah, the default checks can be pretty wonky. Testing is key!
For sure, I'm learning that the hard way!
Glad you found a workaround! But honestly, Medium's subscription model kinda sucks. People are looking for accessible info, and not everyone can pay up. You should consider posting your findings straight to GitHub instead for better reach!
Totally agree! An open issue on GitHub would have way more visibility.
I see what you're saying. I'll take it into consideration!
Interesting that you opted for a Medium post instead of opening a GitHub issue. An issue is more likely to lead to an actual change. Maybe you should still consider that option?
Yeah, it's a tough call. I want to make sure it gets the right attention!
The maintainers didn't see it as urgent, but I get it.
Exactly, ArgoCD isn’t a health dashboard. You need tools like Datadog or Grafana to really keep an eye on your infrastructure.
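If a full Datadog or Grafana setup is not in place yet, a small script can at least surface the failures that ArgoCD hides. The sketch below is an assumption-heavy illustration: it expects a JSON list of Crossplane managed resources (for example, what something like `kubectl get managed -o json` would print), and the function name `failing_resources` is hypothetical. It only flags `Synced=False`, since `Ready=False` can be normal while a resource is still being created.

```python
import json


def failing_resources(resources_json: str) -> list[dict[str, str]]:
    """List managed resources whose Synced condition is False.

    Hypothetical helper; input is assumed to be a Kubernetes List in JSON form.
    """
    items = json.loads(resources_json).get("items", [])
    failures = []
    for item in items:
        for cond in item.get("status", {}).get("conditions", []):
            # Synced=False means the last reconcile against the provider failed.
            if cond.get("type") == "Synced" and cond.get("status") == "False":
                failures.append(
                    {
                        "kind": item.get("kind", ""),
                        "name": item.get("metadata", {}).get("name", ""),
                        "reason": cond.get("reason", ""),
                        "message": cond.get("message", ""),
                    }
                )
    return failures


if __name__ == "__main__":
    import sys

    # Usage (assumed): kubectl get managed -o json | python this_script.py
    for failure in failing_resources(sys.stdin.read()):
        print(f"{failure['kind']}/{failure['name']}: {failure['reason']} {failure['message']}")
```

Running something like this on a schedule, or feeding its output into an alert, covers the gap until proper dashboards exist.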