Has Anyone Else Faced This Silent Failure Issue with ArgoCD and Crossplane?

Asked By CloudyDreamer42 On

I've been battling a frustrating problem with ArgoCD and Crossplane. ArgoCD reports everything as "Healthy" and "Synced", yet Crossplane is failing to provision AWS resources and throwing numerous 400 errors. As a result, Lambda functions aren't updating and RDS instances are getting stuck. The cause seems to be that ArgoCD's health checks process conditions in array order, so a failure is ignored whenever a "Ready: True" condition appears before a "Synced: False" one. I've searched extensively online and found nothing about this, which makes me feel like I'm the only one dealing with it. I fixed the health check logic myself, but I'm shocked that this doesn't seem to be a known problem. Have others encountered this? Are most people not using health checks with Crossplane, or are they just monitoring AWS directly? Am I just really unlucky?
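To make the failure mode concrete, here is a minimal Python sketch of a first-match health check. It is only an illustration of the ordering problem described above, not ArgoCD's actual code (real ArgoCD health checks are Lua scripts), and the function name and sample conditions are made up for the example.

```python
# Illustrative only: mimics a "return on first match" health check.
# This is NOT ArgoCD's real implementation; names here are hypothetical.

def first_match_health(conditions):
    """Return a status based on the first recognised condition, in array order."""
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            return "Healthy"   # returns before any later condition is inspected
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    return "Progressing"


# Same two conditions, different array order.
ready_first = [
    {"type": "Ready", "status": "True", "reason": "Available"},
    {"type": "Synced", "status": "False", "reason": "ReconcileError"},
]
synced_first = list(reversed(ready_first))

print(first_match_health(ready_first))   # Healthy  (the failure is masked)
print(first_match_health(synced_first))  # Degraded
```

The resource is identical in both calls; only the position of the conditions in the array changes the result.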

5 Answers

Answered By OldSchoolDev On

Yeah, I faced a similar issue in the past. Luckily, I read up on Argo's health check behavior beforehand. It’s crucial to write custom health checks for your resources since the defaults might not work well with Crossplane.
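For anyone wondering what a safer check looks like, here is a rough sketch of the logic in Python. Real ArgoCD custom health checks are written in Lua and registered in the `argocd-cm` ConfigMap, so treat this purely as pseudologic to port; the function name and the exact status mapping are my own assumptions, not something from the ArgoCD or Crossplane docs.

```python
# Sketch of an order-independent check. Hypothetical helper, not an ArgoCD API.

def crossplane_health(resource):
    """Evaluate Crossplane conditions by type, never by array position."""
    conditions = resource.get("status", {}).get("conditions", [])
    by_type = {c.get("type"): c for c in conditions}

    synced = by_type.get("Synced")
    ready = by_type.get("Ready")

    # A failed sync always wins, wherever it sits in the array.
    if synced is not None and synced.get("status") == "False":
        return ("Degraded", synced.get("message", "Synced is False"))
    if ready is not None and ready.get("status") == "True":
        return ("Healthy", "")
    return ("Progressing", "waiting for Ready=True")


broken = {"status": {"conditions": [
    {"type": "Ready", "status": "True"},
    {"type": "Synced", "status": "False", "message": "400 from provider"},
]}}

print(crossplane_health(broken))  # ('Degraded', '400 from provider')
```

The key design choice is indexing the conditions by `type` first, so the verdict cannot depend on the order the controller happened to write them in.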

NewbieNerd -

I agree! Many people may not realize this, which could lead to silent failures.

Answered By HonestDev52 On

I think you might be misunderstanding GitOps principles here. It sounds like ArgoCD is doing its job, since the resources in the cluster match the desired state. If Crossplane then fails to provision them, that's on Crossplane, not Argo. You need observability tools to catch these issues rather than relying on ArgoCD's checks alone.
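As a starting point for that kind of monitoring, here is a small Python sketch that scans Crossplane managed resources for any `Ready` or `Synced` condition that is not `True`. It assumes the input is the parsed output of something like `kubectl get managed -o json`; the function name and the sample resources below are hypothetical.

```python
import json

# Hypothetical audit helper: flags managed resources whose conditions disagree
# with a "Healthy" verdict, independent of anything ArgoCD reports.

def failing_conditions(resource_list):
    """Return (kind, name, condition type, reason) for each Ready/Synced condition that is not True."""
    failures = []
    for item in resource_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        for cond in item.get("status", {}).get("conditions", []):
            # "False" and "Unknown" are both reported.
            if cond.get("type") in ("Ready", "Synced") and cond.get("status") != "True":
                failures.append(
                    (item.get("kind", "<unknown>"), name, cond["type"], cond.get("reason", ""))
                )
    return failures


# Made-up sample shaped like a Kubernetes List response.
SAMPLE = json.loads("""
{"items": [
  {"kind": "Function", "metadata": {"name": "example-lambda"},
   "status": {"conditions": [
     {"type": "Ready", "status": "True", "reason": "Available"},
     {"type": "Synced", "status": "False", "reason": "ReconcileError"}]}},
  {"kind": "Instance", "metadata": {"name": "example-db"},
   "status": {"conditions": [
     {"type": "Ready", "status": "True", "reason": "Available"},
     {"type": "Synced", "status": "True", "reason": "ReconcileSuccess"}]}}
]}
""")

report = failing_conditions(SAMPLE)
for kind, name, ctype, reason in report:
    print(f"{kind}/{name}: {ctype} is not True ({reason})")
```

A script like this could feed an alert or a dashboard, so a provisioning failure surfaces even when the GitOps tool shows green.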

CuriousCoder99 -

That's a solid point. It’s important to have monitoring in place to really know what's happening.

DataDynamo -

Exactly! Argo's role is to ensure your cluster state matches what you declared, but it doesn't ensure everything is healthy.

Answered By SystemSleuth On

Why post about this on Medium instead of GitHub? Opening an issue could lead to a fix that helps everyone, not just you.

BetterDev2023 -

The maintainers think it's not crucial enough right now. It’s more of a community problem.

Answered By HelpfulHarry On

Thanks for sharing your experience! We're considering migrating to a similar setup, and it's good to know about potential pitfalls. Have you thought about raising a GitHub issue for this? It could really help others down the line.

CloudyDreamer42 -

I did consider it, but the maintainers felt it's an edge case and are focusing on other priorities for now.

Answered By TechieTraveler71 On

It's great you found a workaround! Still, posting it on Medium as a "Member-only" story could limit who sees it. Why not make it more accessible?

DevOpsGuru12 -

Yeah, it would be better if you shared it openly where more can benefit from it.

HackerJay88 -

Agreed, Medium can be such a hassle with its paywall.
