Trouble Fixing AD and DNS Issues on Domain Controllers

Asked By TechNinja234 On

I've recently taken on the task of troubleshooting a VPN DNS issue with a Meraki MX, where Cisco Secure Client users are unable to resolve internal hostnames. Early on, we discovered the VPN adapter was missing a DNS suffix configuration and that IPv6 was prioritized over IPv4, causing resolution issues for both clients and servers. However, it became more complicated when we identified that Active Directory replication between two domain controllers (HBMI-DC02 and HBMI-DCFS01) has been broken since March 15th.

As I delved deeper, I encountered numerous errors: repadmin failing with an Access Denied message, dnscmd showing ERROR_ACCESS_DENIED and RPC_S_SERVER_UNAVAILABLE, and Server Manager unable to connect to DNS. Even after switching to a domain admin account, these issues persisted. Furthermore, DCFS01 was resolving DC02 using IPv6 link-local addresses instead of IPv4, a problem I fixed by disabling IPv6 at the kernel level, but I still couldn't resolve the overarching issues.
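To see which address family a server is actually being handed for its partner, something like the following can help. It is only a sketch: `classify_addresses` and `resolve` are hypothetical helper names, and the result reflects whatever the local resolver returns on the machine where it runs.

```python
import ipaddress
import socket


def classify_addresses(addresses):
    """Group address strings into IPv4, IPv6 link-local, and other IPv6."""
    groups = {"ipv4": [], "ipv6_link_local": [], "ipv6_other": []}
    for text in addresses:
        # Drop a zone suffix such as "%12" that can follow a link-local address.
        ip = ipaddress.ip_address(text.split("%", 1)[0])
        if ip.version == 4:
            groups["ipv4"].append(str(ip))
        elif ip.is_link_local:
            groups["ipv6_link_local"].append(str(ip))
        else:
            groups["ipv6_other"].append(str(ip))
    return groups


def resolve(hostname):
    """Return the distinct addresses the local resolver reports, in order."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(hostname, None):
        if sockaddr[0] not in seen:
            seen.append(sockaddr[0])
    return seen
```

Running `classify_addresses(resolve("HBMI-DC02"))` on DCFS01 would show whether only `fe80::` addresses come back, or whether an IPv4 address is also being returned.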

We attempted to uninstall and reinstall the DNS role on DCFS01, but that seemed to make things worse. Both domain controllers began behaving abnormally: core Active Directory services appeared to be running as orphaned processes that could not be managed. We observed indicators such as Event ID 1202 entries relating to NTDS and various Kerberos errors, which led us to discover that the local security policy on DC02 was corrupted and had stripped essential service logon rights. Even after applying a fix to restore these rights, the problems continue, with services failing to respond properly.

Given the complexity of the situation, including the long-running replication failure, orphaned services, and unusual error messages, I suspect some corruption or misconfiguration has degraded the environment further. We are now in a degraded state, with at least one client machine reporting that no logon servers are available. I'm looking for advice on how to proceed without making things worse. Any ideas?
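For context, the length of the replication gap matters: a domain controller that has not replicated for longer than the forest's tombstone lifetime generally cannot just be brought back into replication. A minimal sketch of that arithmetic follows. The 180-day figure is an assumption (a common default; older forests often use 60 days, and the real value has to be read from the forest configuration), and the function names are hypothetical.

```python
from datetime import date

# Assumed tombstone lifetime in days; check the forest's actual setting.
ASSUMED_TOMBSTONE_DAYS = 180


def replication_gap_days(last_good, today):
    """Whole days between the last successful replication and today."""
    return (today - last_good).days


def exceeds_tombstone_lifetime(last_good, today, tombstone_days=ASSUMED_TOMBSTONE_DAYS):
    """True when the gap is longer than the assumed tombstone lifetime."""
    return replication_gap_days(last_good, today) > tombstone_days
```

The post gives March 15th without a year, so both dates have to be supplied by whoever runs this, for example `exceeds_tombstone_lifetime(date(2024, 3, 15), date.today())` with the correct year substituted.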

5 Answers

Answered By EventLogHero On

It sounds like you've already diagnosed some of the fundamental issues, but I would definitely recommend digging into the event logs again. Make sure the time synchronization is functioning properly; incorrect date and time settings can lead to a cascade of problems! Double-checking this could shed light on what might have triggered your current mess.
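As a rough illustration of why time matters here: Kerberos rejects authentication when the clocks on two machines differ by more than the configured tolerance, which is 5 minutes by default. The sketch below compares two timestamps against that limit. The function names are hypothetical, and the tolerance should be treated as an assumption because domain policy can change it.

```python
from datetime import datetime, timedelta, timezone

# Kerberos default clock tolerance; domain policy may set a different value.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(local_time, reference_time):
    """Absolute difference between two timezone-aware timestamps."""
    return abs(local_time - reference_time)


def within_kerberos_tolerance(local_time, reference_time, max_skew=MAX_SKEW):
    """True when the two clocks are close enough for Kerberos to accept."""
    return clock_skew(local_time, reference_time) <= max_skew
```

Feed it the current time as reported by each DC (converted to UTC) to see whether the two servers are close enough to authenticate to each other.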

Answered By BackTrackSam On

The best course of action might actually be to assume any problematic DCs are compromised and shut them down completely. You can then remove them from AD, create a new domain controller, and let it take on the roles. Just make sure to thoroughly check the Group Policies, especially those changed around March 15th, as they could be contributing to your issues. You should keep your DCs dedicated to their services without overloading them with other tasks.

Answered By ServerBuster99 On

You might want to examine the FSMO roles and see if the role holder is in good condition and whether it resolves DNS properly. If the problematic DCs are beyond saving, consider a force removal. It's not pretty, but given the chaos, it sounds like they may have been purposefully sabotaged during the transition.
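If you capture the output of `netdom query fsmo`, a small script can summarize which server holds which role. This is a sketch that assumes the usual two-column layout of that command (role name, then the holder's FQDN). The helper names, the `example.local` suffix, and the role distribution in `SAMPLE` are all made up for illustration.

```python
def parse_fsmo_holders(netdom_output):
    """Map each FSMO role name to the server reported as holding it."""
    holders = {}
    for line in netdom_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or single-token line
        server = parts[-1]
        # Role lines end in a dotted host name; the status line ends in a period.
        if "." not in server or server.endswith("."):
            continue
        holders[" ".join(parts[:-1])] = server
    return holders


def roles_by_server(holders):
    """Invert the role -> server map into server -> list of roles."""
    grouped = {}
    for role, server in holders.items():
        grouped.setdefault(server, []).append(role)
    return grouped


# Hypothetical sample; real output will show your own domain and holders.
SAMPLE = """\
Schema master               HBMI-DC02.example.local
Domain naming master        HBMI-DC02.example.local
PDC                         HBMI-DCFS01.example.local
RID pool manager            HBMI-DCFS01.example.local
Infrastructure master       HBMI-DCFS01.example.local
The command completed successfully.
"""
```

`roles_by_server(parse_fsmo_holders(SAMPLE))` groups the five roles under each holder, which makes it obvious what would need to be seized if one DC is retired.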

Answered By FixItFrank87 On

First off, make backups of both domain controllers. If you're unsure of the Directory Services Restore Mode (DSRM) password, reset it on each server. Decide which of the two is in better shape; hopefully, it's the VM. Take the other DC offline permanently. Fix the healthy one, seize any FSMO roles the retired DC held, and completely remove references to the faulty DC from AD and DNS (a metadata cleanup). You might want to build a new server in the meantime so that once you've cleaned things up, you can promote it to a DC and restore order. It's a bit of a rough approach, but sometimes that's what you need to do in these messy situations!

Answered By CloudSurfer92 On

You really need to focus on whittling down the mess to just one operational DC. Once you have that healthy DC, ensure it's the authoritative one. From there, the goal would be to introduce a fresh DC after you've cleaned the environment. A lot of this inconsistency you’re seeing could be the result of having a second DC that’s either non-functional or seriously misbehaving. Once the new one is in place, check that everything's functioning as it should.
