I'm stepping into a role where I've inherited a cloud setup with Linux VMs configured for SSSD across six domains. I've run into a frustrating problem on one specific server: when querying users in the '.bad.com' domain, it can take four minutes to retrieve group information from the domain controller. This delay often causes SSH to time out before I'm even prompted for a password. Interestingly, for users in the other domains, the 'id' command returns almost instantly, in about 0.004 seconds. All the VMs share the same network setup, and I've checked the routes: pings and traceroutes from this server show no issues. I've also enabled debug logging in the SSSD config and tried the performance tuning parameters, but nothing has helped. I tried replicating the setup on a test VM and could not reproduce the issue, so I'm at a loss. Any tips or creative ideas would be appreciated!
4 Answers
If you're dealing with Active Directory, your issue may stem from which domain controller ends up handling the requests for the bad.com domain. If the queries land on a DC that lacks the full LDAP database, the lookup can get painfully slow. I suggest ensuring all non-read-only DCs are global catalogs. Check your replication setup as well to make sure it's running smoothly. Running the dcdiag tool on the domain controllers can be very helpful for troubleshooting this.
This could well be a DNS issue. I've seen similar problems where misconfigured DNS caused SSH logins via SSSD to time out because the system struggled to resolve PTR records. If forward or reverse lookups for the bad.com domain controllers aren't resolving correctly, that could account for some of the delay you're experiencing.
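To see whether name resolution is where the time goes, something like the following could help. It is only a rough sketch using Python's standard `socket` module: it times the forward lookup of a host and then the reverse (PTR) lookup of each address it returned. The helper names `timed` and `check_dns` are made up for this example, and the lookups go through the system resolver, so results depend on the server's own resolv.conf and nsswitch settings.

```python
import socket
import time


def timed(fn, *args):
    """Run fn(*args) and return (seconds_elapsed, result_or_exception)."""
    start = time.perf_counter()
    try:
        result = fn(*args)
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result = exc
    return time.perf_counter() - start, result


def check_dns(hostname):
    """Time the forward lookup of hostname, then the PTR lookup of each address."""
    report = []
    elapsed, infos = timed(socket.getaddrinfo, hostname, None)
    report.append(("forward", hostname, elapsed, infos))
    if isinstance(infos, Exception):
        return report  # nothing to reverse-resolve if the forward lookup failed
    for addr in sorted({info[4][0] for info in infos}):
        elapsed, ptr = timed(socket.gethostbyaddr, addr)
        report.append(("reverse", addr, elapsed, ptr))
    return report


def print_report(hostname):
    """Print one line per lookup with its duration and whether it succeeded."""
    for kind, target, elapsed, result in check_dns(hostname):
        status = "FAILED: %s" % result if isinstance(result, Exception) else "ok"
        print("%-8s %-40s %8.3fs  %s" % (kind, target, elapsed, status))
```

Calling `print_report()` with the name of one of the bad.com domain controllers (the actual hostname is not given in the question, so you would have to supply it) on both the slow server and a healthy one should make a multi-second forward or PTR lookup stand out.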
I recommend you double-check your network routing again. One-off tests from the server might look fine, so repeat them at different times of day and note any anomalies. Have any timeouts been recorded in the LDAP server logs? It's also worth checking load and swap usage on the machines, as both can slow things down.
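For the load and swap part, here is a small sketch that could be run from cron at intervals to capture a snapshot. It assumes a Linux host with `/proc/meminfo` available; the function names `parse_swap_kib` and `load_and_swap` are just illustrative.

```python
import os


def parse_swap_kib(meminfo_text):
    """Return (swap_total_kib, swap_free_kib) parsed from /proc/meminfo content."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])  # first field is the size in kB
    return values.get("SwapTotal", 0), values.get("SwapFree", 0)


def load_and_swap():
    """Return the 1/5/15-minute load averages and the swap in use (KiB)."""
    with open("/proc/meminfo") as handle:
        total, free = parse_swap_kib(handle.read())
    return os.getloadavg(), total - free
```

Logging the output of `load_and_swap()` alongside a timestamp on the affected server and on a healthy one would show whether the slow lookups line up with memory pressure or high load.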
This definitely sounds frustrating! It might be worth checking whether one of IPv4 or IPv6 is blocked on the path to the domain controllers, since a client that tries the blocked protocol first has to wait for it to time out. Have you tried using ldapsearch or a similar tool to validate the LDAP connection? That could give more insight into the delays you're encountering.
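One way to test the IPv4 versus IPv6 theory before reaching for ldapsearch is to time a plain TCP connect over each address family separately. The sketch below uses only Python's `socket` module; `probe_families` is a made-up name, port 389 is the standard LDAP port and assumed here (adjust if you use 636 or the global catalog ports), and this only checks TCP reachability, not an actual LDAP bind or search.

```python
import socket
import time


def probe_families(host, port=389, timeout=5.0):
    """Try a TCP connect to host:port over IPv4 and IPv6 separately, timing each."""
    results = []
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror as exc:
            results.append((label, None, None, "no address: %s" % exc))
            continue
        for info in infos:
            sockaddr = info[4]
            start = time.perf_counter()
            try:
                with socket.socket(family, socket.SOCK_STREAM) as sock:
                    sock.settimeout(timeout)
                    sock.connect(sockaddr)
                outcome = "connected"
            except OSError as exc:  # includes timeouts and refused connections
                outcome = "failed: %s" % exc
            results.append((label, sockaddr[0], time.perf_counter() - start, outcome))
    return results
```

If one family connects in milliseconds while the other hangs until the timeout, that points at a blocked protocol rather than at SSSD itself.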