I'm really frustrated right now! I've been stuck for a week on a major issue involving ArgoCD and Crossplane: ArgoCD shows resources as "Healthy" and "Synced" even when Crossplane is failing to provision the underlying AWS resources. Crossplane keeps getting 400 errors from AWS, yet the ArgoCD dashboard gives a false sense of security. Everything looks fine while Lambda functions are not updating, RDS instances are stuck, and IAM roles aren't being created. I've looked everywhere for guidance, but I can't find anyone else who seems to be facing this. As far as I can tell, the Lua health check logic in ArgoCD is at fault: the order in which it evaluates statuses can misrepresent the actual state of a resource. I've had to work around it by manually reordering the health checks, and I'm surprised no one else seems to know about this flaw. Can anyone relate to my struggle? Are people just skipping health checks with Crossplane? Am I just that unlucky?
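To make the ordering problem concrete, here is a rough Python sketch of how the order of checks can change the reported health. It is not ArgoCD's actual Lua code, and the post does not say which conditions are involved. The sketch assumes the resource exposes Crossplane-style `Ready` and `Synced` conditions, and both function names are hypothetical.

```python
from typing import Dict, List

Condition = Dict[str, str]


def _status_of(conditions: List[Condition], cond_type: str) -> str:
    """Return the status of the named condition, or "Unknown" if it is absent."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond.get("status", "Unknown")
    return "Unknown"


def health_ready_first(conditions: List[Condition]) -> str:
    # Hypothetical problematic ordering: a stale Ready=True short-circuits
    # the check, so a failing Synced condition is never inspected.
    if _status_of(conditions, "Ready") == "True":
        return "Healthy"
    if _status_of(conditions, "Synced") == "False":
        return "Degraded"
    return "Progressing"


def health_failures_first(conditions: List[Condition]) -> str:
    # Reordered variant: look for a reconcile failure before trusting Ready.
    if _status_of(conditions, "Synced") == "False":
        return "Degraded"
    if _status_of(conditions, "Ready") == "True":
        return "Healthy"
    return "Progressing"


# Assumed example: the resource was Ready earlier, but the latest reconcile
# against the cloud API failed (for instance with a 400 error).
stale_ready = [
    {"type": "Ready", "status": "True", "reason": "Available"},
    {"type": "Synced", "status": "False", "reason": "ReconcileError"},
]

print(health_ready_first(stale_ready))     # Healthy
print(health_failures_first(stale_ready))  # Degraded
```

The same conditions give "Healthy" under the first ordering and "Degraded" under the second, which is the kind of mismatch described above. A real fix would go into an ArgoCD custom health check written in Lua, not into Python.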
5 Answers
Why post on Medium when you could just create a GitHub issue? If it's causing problems, they should know about it, right?
I've been there too! I figured out how Argo's health checks work a while back, before I started using Crossplane. It's important to write custom health checks for your resources, since the defaults don't cover everything.
Totally agree! This isn't widely known, and I think more users should be aware of it.
Thanks for detailing your experience! We're considering moving to the same tech stack, and your input might save me a lot of future headaches. Have you thought about filing a GitHub issue to make more people aware? It could help many others deal with it before they even find out the hard way.
I did think about filing an issue, but after discussing it with the maintainers, they said it’s more of a niche issue involving AWS providers. I'm hoping my article will help others in a similar situation!
Honestly, it sounds like you're misunderstanding how GitOps and ArgoCD work. ArgoCD is doing its job: it's syncing the resources into the cluster. Crossplane failing after that isn't ArgoCD's fault. You really should have monitoring in place for the health of your deployments rather than relying only on ArgoCD's status check.
Yeah, exactly! ArgoCD is just a deployment tool; you need something like Grafana or Prometheus to actually track the health of your resources.
It's great to hear you found a workaround! Just a heads-up though, Medium articles can be tricky since they often have paywalls. Maybe posting it directly on GitHub would reach more people who could benefit from your solution.
For sure! Medium can be a pain with the member-only stuff. Why not make it publicly accessible?
I talked to the Crossplane maintainers, and they don’t see this as a critical problem. It seems to be more of a community thing, unfortunately.