I'm getting really frustrated with an issue involving ArgoCD and Crossplane. ArgoCD shows my resources as "Healthy" and "Synced", yet Crossplane is failing to provision the AWS resources behind them and is throwing 400 errors all around. According to ArgoCD everything is fine, but my Lambda functions aren't updating, RDS instances are stuck, and IAM roles aren't even being created. I've scoured the internet for days and found hardly any mention of this. The health check logic seems broken: if 'Ready: True' appears before 'Synced: False' in a resource's conditions, ArgoCD treats the resource as healthy and ignores the failures happening in the background. Has anyone else encountered this? Are you running health checks against Crossplane resources, or are you all monitoring AWS directly? Did I just end up with a rare configuration? I need to know I'm not alone in this struggle!
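To make the ordering problem concrete, here is a minimal Python sketch of the behavior I think I'm seeing. This is not ArgoCD's actual health check code (as far as I know, ArgoCD health checks are written in Lua); `first_match_health` and the sample conditions are hypothetical and only illustrate how a check that stops at the first condition it recognises becomes order-dependent.

```python
# Hypothetical sketch of an order-dependent health check.
# This is NOT ArgoCD's real implementation; it only illustrates the pitfall.

def first_match_health(conditions):
    """Return a health status based on the first recognised condition only."""
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            return "Healthy"  # returns early, later conditions are never inspected
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    return "Progressing"


# Sample conditions shaped like the ones on a Crossplane managed resource.
ready = {"type": "Ready", "status": "True"}
sync_failed = {"type": "Synced", "status": "False", "reason": "ReconcileError"}

# Same two conditions, different order, different verdict.
ready_first = first_match_health([ready, sync_failed])
synced_first = first_match_health([sync_failed, ready])
print(ready_first, synced_first)
```

With 'Ready: True' listed first, the failed sync is never looked at, which matches what I see in the UI.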
5 Answers
It's great that you found a workaround! But honestly, posting it behind a Medium paywall isn't cool. Maybe just stick to sharing it openly where everyone can access it, like on GitHub.
I steer clear of Medium too, especially content behind a paywall.
Reporting the bug is definitely more productive than trying to work around the problem. The condition checks should reflect the actual status correctly; that’s what needs fixing here!
Honestly, it seems like there's a misunderstanding about what GitOps and ArgoCD do. ArgoCD is showing that the resources are synced, which is correct. The issues you're facing come after that, and it’s not ArgoCD's responsibility to check the health of those AWS resources. You should have other monitoring tools for that! This is expected behavior for GitOps: it’s not about everything being perfectly healthy, just that your infrastructure reflects what’s declared.
Right? ArgoCD isn't a health dashboard; it's meant for continuous deployment. Relying solely on it for health might lead to unnoticed issues.
You're spot on! ArgoCD keeps things in sync, but for actual health checks you'd need a monitoring tool like Datadog or Prometheus to catch those errors.
I ran into this issue a while ago. When I started using ArgoCD, I learned early on about the health check behavior. It seemed everyone I knew who used Argo had a handle on writing custom health checks for their resources so they wouldn’t fall into this trap.
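For anyone who hasn't written one, here is a rough sketch of the logic such a custom health check might implement. Real ArgoCD custom health checks are written in Lua and configured in the `argocd-cm` ConfigMap, so treat this Python version as an illustration of the decision logic only. The function name `crossplane_health` and the exact status mapping are my own assumptions, not something taken from ArgoCD or Crossplane.

```python
# Hypothetical sketch: order-independent health evaluation for a resource
# that reports both "Ready" and "Synced" conditions.

def crossplane_health(resource):
    """Map a resource's conditions to a health status, regardless of their order."""
    conditions = resource.get("status", {}).get("conditions", [])
    # Index conditions by type so list order cannot influence the result.
    by_type = {c["type"]: c for c in conditions}
    synced = by_type.get("Synced")
    ready = by_type.get("Ready")

    # A failed sync always wins, even if the resource still reports Ready.
    if synced is not None and synced.get("status") == "False":
        return {"status": "Degraded", "message": synced.get("message", "")}

    # Healthy only when both conditions are present and true.
    if (ready is not None and ready.get("status") == "True"
            and synced is not None and synced.get("status") == "True"):
        return {"status": "Healthy", "message": ""}

    return {"status": "Progressing", "message": "Waiting for Ready and Synced"}


# Example: Ready is listed first, but the failed sync still marks it Degraded.
example = {"status": {"conditions": [
    {"type": "Ready", "status": "True"},
    {"type": "Synced", "status": "False", "message": "400 from provider"},
]}}
print(crossplane_health(example)["status"])
```

The point is simply to look at every condition before deciding, and to let a failed sync override a stale 'Ready: True'.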
Seems not everyone shares that knowledge, though. It’s an important lesson to learn.
Why isn't this documented more clearly? Instead of just putting your findings on Medium, have you considered opening an issue on GitHub to see if others have faced this? It feels like your findings could help many people who might be unaware of this problem.
I thought about it, but after speaking with Crossplane’s maintainers, they suggested this is more of a community issue right now. I figured sharing my article could still help!
Totally agree! Medium's member-only feature is a bummer.