I've been wrestling with a frustrating issue that feels like a GitOps nightmare. ArgoCD shows resources as 'Healthy' and 'Synced' while Crossplane is failing to provision the underlying AWS resources. AWS keeps returning 400 errors, yet ArgoCD confidently tells me everything is fine! This has been a huge pain point, especially with Lambda functions not updating and RDS instances getting stuck.
What's really baffling is that I've searched extensively and found almost no information on this. It's as if I'm the only one hitting this broken health check logic. Basically, if 'Ready: True' appears before 'Synced: False' in the resource's status conditions, the check stops at the first match and ArgoCD gives a false green light while chaos brews in the cloud.
I'm curious to know if I'm the odd one out here. Does nobody else have issues with health checks in this context? Are people just monitoring AWS directly instead of relying on ArgoCD? When I finally reordered the Lua health check to evaluate error conditions first, I got a sense of relief, yet I'm shocked this isn't more widely recognized, especially since the default health checks have this flaw. Am I just ultra-unlucky, or is there a bigger conversation to be had here?
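To make the ordering problem concrete, here is a minimal sketch in Python rather than the Lua that ArgoCD actually runs. It is a simplified model of the behaviour described above, not a copy of any built-in check: the function names, the `ReconcileError` reason, and the exact status strings returned are assumptions for illustration. The first function returns on the first condition it recognizes, so list order decides the result; the second looks for a failed 'Synced' condition before it considers 'Ready'.

```python
from typing import Dict, List

# Hypothetical shape of one entry in a managed resource's status conditions.
Condition = Dict[str, str]


def naive_health(conditions: List[Condition]) -> str:
    """Order-dependent check: the first recognized condition wins."""
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            return "Healthy"
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    return "Progressing"


def errors_first_health(conditions: List[Condition]) -> str:
    """Order-independent check: scan for failures before reporting healthy."""
    # Pass 1: any failed Synced condition marks the resource as degraded.
    for cond in conditions:
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    # Pass 2: only now is a Ready condition allowed to report healthy.
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            return "Healthy"
    return "Progressing"


# Assumed example: Ready is still True from an earlier reconcile, while the
# latest reconcile failed and set Synced to False.
ready_then_failed = [
    {"type": "Ready", "status": "True"},
    {"type": "Synced", "status": "False", "reason": "ReconcileError"},
]

print(naive_health(ready_then_failed))         # Healthy (the false green light)
print(errors_first_health(ready_then_failed))  # Degraded
```

The same two-pass idea should carry over to a custom Lua health check: loop over the conditions once looking only for failures, and only then decide whether the resource can be reported as healthy.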
5 Answers
Honestly, you might be misunderstanding how GitOps and ArgoCD function. ArgoCD's behavior is to ensure that the resources declared in your repo match those in your cluster. As far as it's concerned, if the resources are there and synced, it’s doing its job—even if Crossplane is later hitting errors. The key is to have separate monitoring in place for runtime health, not just rely on ArgoCD.
Glad you found a workaround, but I gotta say, Medium articles with 'Member-only' tags are a bit of a turn-off. It feels like a paywall over something that should be more accessible to the community.
I've actually run into this situation before. Most folks using ArgoCD write custom health checks. It's pretty common knowledge that the default health checks sometimes fail to capture everything accurately. I'd suggest adding proper monitoring tools like Datadog or Prometheus to catch those hidden issues.
Thanks for sharing your experience! I'm considering a migration to a similar setup. Have you thought about creating that GitHub issue? This is something that could potentially affect a lot of people.
Why isn’t this a GitHub issue instead of a Medium article? Seems like you’re on to something that might help a lot more people if it gets the right visibility.