Issues After Building a Windows SQL Cluster

Asked By TechieTornado25 On

I recently set up a SQL Server 2019 cluster on Windows Server 2025. Initially everything worked smoothly, and I even loaded data onto it while waiting for vendor testing. Yesterday, however, I could not connect to the cluster's VIP from SQL Server Management Studio. When I checked the VMware virtual machines, the additional IPs for the active nodes were missing and the shared drives didn't appear in Windows. The drives do show up in Disk Management, but I can't bring them online, and I can't start the cluster. In the datastore for the first node I can see the shared drives, but it seems that without the quorum drive the nodes are in conflict over which one is active.

This is my first time building a Windows cluster, apart from a DFS cluster. The shared drives come from a SAN and were added as RDM disks to the primary node. I ran cluster validation, and the only errors were related to disk storage. Has anyone experienced something similar? I'm looking for resources to help troubleshoot this.

6 Answers

Answered By PatchFinder82 On

Not sure if this is related, but I had serious trouble setting up a SQL cluster on Server 2025 a couple of months back. After a ton of troubleshooting, I discovered that a patch was actually causing the failures.

Answered By VMNetworkNerd On

Review the cluster logs thoroughly. Also double-check VMware’s documentation for SQL Server Always On Availability Groups and Failover Cluster Instances; configuration steps sometimes get missed, especially around storage adapters. With two nodes, losing only the witness disk shouldn’t cause a significant operational change, so something else might be off, possibly in the networking setup.
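
To illustrate the point about the witness, here is a minimal sketch of the vote arithmetic, assuming a simple static-majority model with three votes (two nodes plus one disk witness). The function name and the scenario list are made up for illustration, and real clusters can adjust votes dynamically, so treat this as a rough mental model rather than how the cluster service actually decides.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Static majority rule: strictly more than half of all configured votes.

    Simplified model only; it ignores dynamic quorum and dynamic witness behavior.
    """
    return votes_online > total_votes // 2


# Assumed layout: two nodes + one disk witness = three votes.
TOTAL_VOTES = 3

# Hypothetical failure scenarios mapped to the number of votes still online.
scenarios = {
    "all healthy": 3,
    "witness disk lost": 2,
    "witness and one node lost": 1,
}

for name, online in scenarios.items():
    print(f"{name}: quorum={has_quorum(online, TOTAL_VOTES)}")
```

In this model, losing only the witness still leaves two of three votes, so the cluster should stay up. If neither node will start, more than the witness is likely unavailable, for example the nodes cannot see each other or the shared disks.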

Answered By DiskDoctor73 On

In my experience, a cluster can fail if the cluster database file (CLUSDB) gets corrupted. We once restored just that single file from backup, placed it on both servers, and restarted SQL. It worked for us!

Answered By TechSavvyTinkerer On

You might want to use Always On Availability Groups with regular disks instead of an FCI with RDM disks. That removes the shared-storage dependency and could eliminate some of these issues.

Answered By EvntViewerGuru99 On

Have you looked at the Event Viewer? It can provide insights into what might have gone wrong.

Answered By QueryQuester42 On

Check Failover Cluster Manager to see if there's any indication of what’s wrong. If you really did lose the quorum disk, search for "WSFC disaster recovery through forced quorum" for guidance on recovering your cluster.
