I feel like I'm losing my mind over here. I've spent almost a week trying to figure out a frustrating GitOps issue with ArgoCD and Crossplane. The problem is that ArgoCD displays resources as 'Healthy' and 'Synced' even though Crossplane is failing to provision AWS resources. I keep getting 400 errors from AWS, but ArgoCD seems to be blissfully unaware of these issues. I'm experiencing problems like Lambda functions not updating and RDS instances getting stuck, all while ArgoCD's dashboard looks perfect.
What's really odd is that I couldn't find any information online about this problem. It's like I'm the only person using this combination who has noticed that the health checks fail because their Lua logic evaluates 'Ready: True' before 'Synced: False', so a resource that reports Ready still shows as Healthy even when it is failing to sync. I ended up fixing it by changing the order of the checks, but it baffles me that no one else seems to know about this.
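To make the ordering problem concrete, here is a minimal sketch in Python. ArgoCD's real health checks are written in Lua, so this is only an illustration of the logic, not the actual script. The function names and the sample conditions are hypothetical, and it assumes the resource exposes Crossplane-style `Ready` and `Synced` conditions.

```python
from typing import Dict, List, Optional

Condition = Dict[str, str]


def status_of(conditions: List[Condition], cond_type: str) -> Optional[str]:
    """Return the status string of the first condition of the given type, if any."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond.get("status")
    return None


def health_ready_first(conditions: List[Condition]) -> str:
    """Problematic ordering: a Ready=True condition wins before Synced is inspected."""
    if status_of(conditions, "Ready") == "True":
        return "Healthy"
    if status_of(conditions, "Synced") == "False":
        return "Degraded"
    return "Progressing"


def health_synced_first(conditions: List[Condition]) -> str:
    """Reordered check: a failed sync is reported even if the resource is still Ready."""
    if status_of(conditions, "Synced") == "False":
        return "Degraded"
    if status_of(conditions, "Ready") == "True":
        return "Healthy"
    return "Progressing"


# Hypothetical conditions for a resource that was provisioned earlier
# but whose latest update is being rejected by the cloud provider.
stuck_resource = [
    {"type": "Ready", "status": "True"},
    {"type": "Synced", "status": "False", "reason": "ReconcileError"},
]

print(health_ready_first(stuck_resource))   # Healthy
print(health_synced_first(stuck_resource))  # Degraded
```

With the first ordering, the failed sync never gets looked at, which matches the "everything is green" dashboard described above. Checking `Synced` first surfaces the failure.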
Am I the only one dealing with this issue? Are other users not utilizing health checks with Crossplane? Do I just have bad luck? I documented my solution in detail since I want to help any future users who might run into the same issue. I even opened a GitHub issue to report it, so hopefully, it will get the attention it deserves!
5 Answers
Why is this a Medium article instead of a GitHub issue? It’s easier to fix the core issue than write an article.
Great job finding a workaround! But I gotta say, Medium articles can be a pain, especially when they’re behind a paywall. Maybe consider sharing your findings in a more accessible way?
Totally agree! Medium can be hit or miss, especially with those member-only stories.
I’m with you. If I have to pay to read a fix, forget it!
Thanks for sharing your experience! I'm considering switching to this stack, so your write-up is super helpful. Have you thought about creating a GitHub issue for this? It seems like it could impact a lot of users.
I did, but maintainers said it’s an edge case they’re not prioritizing right now. Still, sharing this has helped!
I’ve actually faced this issue before too. Luckily, I was already aware of ArgoCD's health check quirks when I started working with Crossplane. I assumed most folks would already know to customize health checks if the defaults didn't cut it.
I think a lot of people aren't as aware of this as they should be. Definitely a good heads-up!
It seems like you might be interpreting GitOps and ArgoCD a bit differently. ArgoCD is reflecting that the resources are indeed synced, but it doesn’t track failures that happen afterward. You might need actual monitoring tools to catch these issues. GitOps is mainly about the live state matching the declared state. The health checks from Argo aren’t meant to guarantee everything is running smoothly.
Yes, exactly! That's expected behavior in GitOps. You need something like Datadog or Prometheus for real system health.
That makes sense! I hadn’t thought of separating the monitoring aspect from deployment.
Because the maintainers don’t view this as a priority issue right now.