Hey everyone, I'm pulling my hair out over a major issue with ArgoCD and Crossplane that's been driving me completely batty this past week. The problem is that ArgoCD shows my AWS resources as "Healthy" and "Synced," even though Crossplane is actually failing to provision them! We're getting a bunch of 400 errors from AWS, but ArgoCD's dashboard is just like, "Everything's cool!"
I'm seeing things like Lambda functions not updating, RDS instances stuck forever, and IAM roles not being created, all while ArgoCD sits there with its reassuring green lights. I've been scouring the internet for answers on this and found pretty much nothing: no blog posts, no questions on Stack Overflow, nothing in GitHub issues. It feels like I'm the only one who's noticed that ArgoCD's health checks can report a status that doesn't match what Crossplane is actually doing.
As far as I can tell, the Lua logic for those health checks evaluates conditions in a way that can give a false sense of security: if `Ready: True` comes before `Synced: False` in the conditions list, ArgoCD just assumes everything is okay, even if resources are failing left and right. Has anyone else run into this, or am I just really unlucky or misconfigured? Is anyone running Crossplane without relying on these health checks? Or are people just looking at AWS directly instead of trusting ArgoCD? Please tell me I'm not the only one experiencing this!
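To make the ordering problem concrete, here is a minimal Python sketch of the behavior I'm describing. It is not ArgoCD's actual Lua; the function name `first_match_health` and the sample condition list are made up for illustration, and only the `Ready` and `Synced` condition types come from Crossplane.

```python
# Hypothetical re-creation of an order-dependent health check.
# NOT ArgoCD's real Lua code, only a sketch of the behavior described above.

def first_match_health(conditions):
    """Return a health status based on the first decisive condition found."""
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            # Returns early, so later conditions are never inspected.
            return "Healthy"
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    return "Progressing"


# Made-up status.conditions for a managed resource whose last AWS call failed.
conditions = [
    {"type": "Ready", "status": "True", "reason": "Available"},
    {"type": "Synced", "status": "False", "reason": "ReconcileError"},
]

print(first_match_health(conditions))                   # Ready listed first
print(first_match_health(list(reversed(conditions))))   # Synced listed first
```

The same two conditions come out as `Healthy` in the first call and `Degraded` in the second, purely because of their order in the list.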
4 Answers
I ran into a similar issue ages ago. Good on you for figuring out how to navigate these health checks! Wish more folks knew about this stuff. Check your configurations or customize those health checks since the defaults might not cut it sometimes.
What’s up with all this Medium talk? Why not report this directly to GitHub instead? This issue sounds serious and could really help others if it's fixed.
Because the maintainers see this more as a community issue. But sharing it on GitHub could definitely help raise awareness!
Honestly, it sounds like you're misunderstanding how ArgoCD and GitOps work. ArgoCD is correctly showing that resources are synced, even if Crossplane has issues later on. The idea is to have monitoring in place outside of ArgoCD to catch these failures. It's not meant to handle every failure scenario on its own; that's where something like Prometheus or Grafana comes in!
I've seen this too! Argo isn't a complete health monitoring solution; it's more about deployment consistency.
Exactly! ArgoCD just ensures your cluster state matches what you define. The health checks often need to be tailored, especially for custom resources like those from Crossplane.
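As a rough sketch of what "tailored" could mean here, the check below looks at every condition before deciding, so the order they appear in no longer matters. It's written in Python only to show the logic; a real ArgoCD custom health check would be written in Lua, and the function name `crossplane_health` and the shape of the returned dict are assumptions for illustration.

```python
# Hypothetical order-independent health evaluation for a Crossplane resource.
# Illustrative only; names and return shape are assumptions, not an ArgoCD API.

def crossplane_health(resource):
    """Derive a health status from all conditions, regardless of their order."""
    conditions = resource.get("status", {}).get("conditions", [])
    by_type = {c["type"]: c for c in conditions}

    synced = by_type.get("Synced")
    ready = by_type.get("Ready")

    # A failed reconcile takes priority over a stale Ready=True.
    if synced and synced["status"] == "False":
        return {"status": "Degraded", "message": synced.get("message", "")}

    # Only report Healthy when both conditions are present and True.
    if synced and ready and synced["status"] == "True" and ready["status"] == "True":
        return {"status": "Healthy", "message": ""}

    return {"status": "Progressing", "message": "Waiting for Synced and Ready"}


# Made-up resource: Ready is still True, but the latest reconcile failed.
resource = {
    "status": {
        "conditions": [
            {"type": "Ready", "status": "True"},
            {"type": "Synced", "status": "False", "message": "400 from AWS"},
        ]
    }
}

print(crossplane_health(resource)["status"])
```

With this approach a resource that has `Synced: False` is reported as `Degraded` even when `Ready: True` is listed first.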
Glad you found a workaround! Just a heads-up, though: Medium's kind of a pain with its member-only stories. Might wanna stick to GitHub for sharing stuff like this in the future!
Yeah, member-only is a dealbreaker for me too. Just keep it open and accessible!
Totally agree! Medium just isn't the best for this.
I still think this info should be more widely known. It's super helpful for everyone working with similar stacks.