Hey everyone! I'm facing a peculiar issue with DNS on my Windows Server 2019 Domain Controller (DC). Our local devices use the DC for DNS, and the DC forwards queries for external names to other servers. For example, the main site example.com works perfectly, and nslookup resolves it without any issues. However, there's a subdomain, online.example.com, for which nslookup returns no IP address, so users can't access it. The strange part is that clearing the DNS cache on the DC temporarily fixes the issue: the subdomain resolves again until, some time later, it stops. We never had this issue with Windows Server 2008 R2. While I have a few temporary fixes in mind, I'm really keen on understanding the root cause of this problem and why it specifically affects this subdomain. I've checked the logs and run DNS diagnostics, but I'm still stumped!
4 Answers
Have you tried running nslookup in debug mode? It might give you a clearer picture of what's going on with the DNS queries.
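If it helps, the thing worth looking for in the debug output is the response code and the answer count, since "no address" can mean either NXDOMAIN (the name does not exist) or a NOERROR reply with zero answers (often called NODATA). Here is a rough Python sketch of the same check done by hand, reading only the 12-byte DNS header. The function names and the server address in the usage note are my own placeholders, not anything from your setup, and it ignores TCP fallback and truncation.

```python
import socket
import struct

# A few common response codes from the DNS header (not an exhaustive list).
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (qtype 1 = A record), recursion desired."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def classify_response(data: bytes) -> str:
    """Classify a raw DNS response using only the header fields."""
    if len(data) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    rcode = flags & 0x000F  # low 4 bits of the flags word
    if rcode == 0:
        # Simplification: any answer record counts as an answer here,
        # even if it is only a CNAME with no address behind it.
        return "ANSWER" if ancount > 0 else "NODATA"
    return RCODE_NAMES.get(rcode, f"RCODE {rcode}")


def query(server: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to `server` and classify the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
    return classify_response(data)
```

You would call something like query("192.0.2.53", "online.example.com") once against the DC and once against each forwarder (192.0.2.53 is a placeholder address) while the problem is happening. If the forwarders say ANSWER and the DC says NODATA or NXDOMAIN, the bad result is sitting in the DC's cache.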
This looks like a negative caching issue. When your DC queries upstream and gets a 'no address' response for the subdomain, it caches that answer for a while. During that time, it keeps returning 'no address' even if the record is actually valid elsewhere. Flushing the cache discards the cached negative answer, so the next lookup goes upstream again and succeeds, which explains the intermittent access. The initial bad answer most likely comes from an upstream DNS server responding inconsistently. Make sure your forwarders are reliable and consistent, and look at the negative cache TTL settings on your DC.
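To make the mechanism concrete, here is a toy Python model of a caching forwarder. It is only a sketch of the general idea, not how the Windows DNS service is implemented: the class name, the TTL values, and the 192.0.2.10 address are all made up for illustration, and a real server normally takes the negative TTL from the zone's SOA record (subject to its own configured maximum).

```python
from typing import Callable, Dict, Optional, Tuple


class CachingForwarder:
    """Toy forwarder that caches both positive and negative answers."""

    def __init__(
        self,
        upstream: Callable[[str], Optional[str]],
        positive_ttl: float = 300.0,   # illustrative value, in seconds
        negative_ttl: float = 900.0,   # illustrative value, in seconds
    ) -> None:
        self.upstream = upstream
        self.positive_ttl = positive_ttl
        self.negative_ttl = negative_ttl
        # name -> (address or None for "no address", expiry time)
        self._cache: Dict[str, Tuple[Optional[str], float]] = {}

    def resolve(self, name: str, now: float) -> Optional[str]:
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            # Served from cache, including a cached "no address" result.
            return entry[0]
        address = self.upstream(name)
        ttl = self.positive_ttl if address is not None else self.negative_ttl
        self._cache[name] = (address, now + ttl)
        return address

    def flush(self) -> None:
        """Equivalent of clearing the server cache."""
        self._cache.clear()


# Upstream modelled as a dict lookup; the record is missing at first.
records: Dict[str, str] = {}
dc = CachingForwarder(records.get)

first = dc.resolve("online.example.com", now=0)       # upstream says "no address"
records["online.example.com"] = "192.0.2.10"          # upstream is now correct
during = dc.resolve("online.example.com", now=60)     # still the cached negative
dc.flush()
after_flush = dc.resolve("online.example.com", now=61)  # asks upstream again
```

In this model a single bad upstream reply at time 0 keeps the name unresolvable for the whole negative TTL, even though upstream has the right record a moment later, and a flush fixes it immediately. That matches the symptoms described in the question.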
I completely relate to your desire to understand the root causes before applying fixes! It’s tough to implement or troubleshoot without knowing what's really going wrong, especially when it's not urgent.
You should check the TTL settings on the subdomain causing issues. I noticed that Windows DNS can behave erratically when the TTL is set too low and the DNS records are large. I ended up fixing a similar issue by using a Linux DNS server for local domains with correct conditional forwarding.

I’ll definitely keep an eye on possible negative caching issues. While I don't think the forwarders are returning bad responses, I can't rule it out entirely. Disabling negative caching might be a temporary workaround, but I really don’t want to go that route unless I have to.