Hey everyone, I've recently set up the descheduler in my cluster, but I keep running into a frustrating issue. Sometimes it throws an error like this:
```
E0708 06:51:40.296421 1 server.go:73] "failed to run descheduler server" err="Get \"https://10.96.0.1:443/api\": dial tcp 10.96.0.1:443: i/o timeout"
E0708 06:51:40.296494 1 run.go:72] "command failed" err="Get \"https://10.96.0.1:443/api\": dial tcp 10.96.0.1:443: i/o timeout"
```
The failure is intermittent; most descheduler runs complete fine, so I can't pinpoint what's causing the timeouts. For context, I'm running Talos Linux v1.10.5 with Kubernetes v1.33.2 and Cilium as the CNI. No other pods are hitting this, and the API server itself seems healthy. Any suggestions on what might be wrong? Thanks!
3 Answers
10.96.0.1 is the ClusterIP of the `kubernetes` Service in the default service CIDR (10.96.0.0/12), i.e. the in-cluster address of the API server. You might want to check whether something in your setup is using a different or conflicting service IP range. It should just work if no configuration changes were made, but it's worth ruling out.
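A quick way to confirm the address is correct for your cluster (these are generic `kubectl` checks, nothing Talos-specific assumed):

```shell
# The in-cluster API server Service; its ClusterIP should match the IP in the error
kubectl get svc kubernetes -n default -o jsonpath='{.spec.clusterIP}'

# Find the kube-apiserver pod(s) and inspect their flags for the
# --service-cluster-ip-range value (pod naming varies by distro)
kubectl -n kube-system get pods -o name | grep apiserver
```

If the ClusterIP printed there matches 10.96.0.1, the descheduler is at least dialing the right address and the problem is more likely on the network path.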
How are you running the descheduler, as a Deployment or a CronJob? I set mine up on a Talos cluster too, but I run it as a CronJob and haven't had any issues so far. Might be worth checking how your setup differs.
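For reference, a minimal CronJob sketch along the lines of the upstream descheduler examples. The schedule, image tag, ServiceAccount name, and ConfigMap name here are assumptions, so adjust them for your cluster:

```yaml
# Hypothetical minimal descheduler CronJob; names and versions are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: descheduler
  namespace: kube-system
spec:
  schedule: "*/30 * * * *"   # run every 30 minutes (assumption)
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: descheduler-sa   # must have the descheduler RBAC bound
          restartPolicy: Never
          containers:
            - name: descheduler
              image: registry.k8s.io/descheduler/descheduler:v0.33.0  # pick your version
              command:
                - /bin/descheduler
                - --policy-config-file=/policy/policy.yaml
              volumeMounts:
                - name: policy-volume
                  mountPath: /policy
          volumes:
            - name: policy-volume
              configMap:
                name: descheduler-policy-configmap
```

With a CronJob, each run gets a fresh pod, so a transient connectivity blip only fails that one run instead of crashing a long-lived server.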
Do you have any network policies that could be interfering? Sometimes those can cause unexpected connectivity issues, even if everything looks fine at first glance.
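An easy way to rule this out is to list both the standard Kubernetes policies and the Cilium-specific CRDs, since Cilium policies can apply cluster-wide even when the descheduler's own namespace looks clean:

```shell
# Standard Kubernetes NetworkPolicies, all namespaces
kubectl get networkpolicies -A

# Cilium's own policy CRDs (namespaced and cluster-wide)
kubectl get ciliumnetworkpolicies -A
kubectl get ciliumclusterwidenetworkpolicies
```

If all three come back empty, policy enforcement is unlikely to be the cause and the intermittent timeout points more toward the datapath or the node the pod lands on.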
There are no network policies affecting it, and it runs in the same namespace, so I’m not sure what the issue could be.
I haven't modified the service IP range, and when I try to curl the same address from other pods, it works fine.