I built a new Windows Server 2025 cluster running SQL Server 2019 about a month ago, and it was working perfectly until yesterday. When I tried to connect to the cluster VIP with SQL Server Management Studio, I couldn't connect at all. On checking, I noticed that the additional IP addresses for the active nodes were missing, and the shared drives were no longer showing up in Windows. I can see the drives in Disk Management, but I can't bring them online or start the cluster.
From what I can tell, without the quorum drive the nodes seem to be competing for active status, which is causing problems. This is my first time in the last 20 years setting up a Windows cluster outside of a DFS setup. The shared drives are presented from a SAN and added to the primary node as RDM disks. Has anyone experienced something like this? I reran the cluster validation, and the only errors reported were about disk storage. I'm not looking for a fix, just some documentation or resources to get me started troubleshooting this issue.
5 Answers
Review the cluster logs for insights. Have you also checked VMware's documentation on the recommended setup for SQL AAG/FCI? It's easy to overlook essential steps, especially with storage adapters. If you've only lost the witness disk, that alone shouldn't make a difference operationally; something else is probably wrong with one of the nodes, possibly related to VMware, Windows configuration, or networking.
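If the cluster log has been exported to a text file (for example with the Get-ClusterLog cmdlet), a small filter can make the error and warning lines easier to scan. The following is only a rough sketch: the line layout it assumes (hex process and thread IDs, "::", a timestamp, a level such as INFO, WARN, ERR or DBG, then the message) and the function names are assumptions for illustration, so adjust the pattern to whatever your log actually contains.

```python
import re
from collections import Counter

# Assumed line shape (verify against your own log before relying on it):
#   <pid>.<tid>::<timestamp> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<ids>[0-9a-fA-F]+\.[0-9a-fA-F]+)::(?P<ts>\S+)\s+"
    r"(?P<level>INFO|WARN|ERR|DBG)\s+(?P<msg>.*)$"
)
TAG_RE = re.compile(r"\[(\w+)\]")


def filter_cluster_log(lines, levels=("ERR", "WARN")):
    """Return (timestamp, level, message) for lines at the given levels."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("msg")))
    return hits


def count_by_tag(hits):
    """Count hits by a leading bracketed tag in the message, if one is present."""
    counts = Counter()
    for _ts, _level, msg in hits:
        tag = TAG_RE.match(msg)
        counts[tag.group(1) if tag else "other"] += 1
    return counts
```

Pass it any iterable of text lines, such as an open file handle, and start reading from the earliest ERR entries around the time the cluster went down.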
A few months back, I had a tough time deploying a SQL cluster on Server 2025 too. After some extensive troubleshooting, it turned out that one of the patches was causing a lot of the failures. Make sure your system is up to date, and check whether any recent patches could be affecting your cluster.
I've had a similar issue where the cluster went down because the clusdb file got corrupted. We restored that specific file from a backup, replaced it on both servers, and restarted SQL, which got things running again. You might want to look into that.
Make sure to check the Event Viewer. It often has logs that give clues about what went wrong with the cluster, and it might help you pinpoint the issue quickly.
Check what Failover Cluster Manager is reporting first. If you genuinely lost the quorum disk, look up 'WSFC disaster recovery through forced quorum.' That approach could come in handy if things have gone south with your cluster.
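To judge whether quorum is really the problem, it can help to work through the vote arithmetic. The sketch below assumes a simple static majority rule, where each node and the witness hold one vote and the cluster needs more than half of all votes online. It deliberately ignores dynamic quorum and dynamic witness, which can adjust vote counts on current Windows Server versions, and the function name is made up for illustration.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Static majority rule: more than half of all configured votes must be online."""
    if not 0 <= votes_online <= total_votes:
        raise ValueError("votes_online must be between 0 and total_votes")
    return votes_online > total_votes // 2


# Two nodes plus a disk witness gives three votes in total.
TOTAL_VOTES = 3

# Witness lost, both nodes still up: 2 of 3 votes online.
witness_lost = has_quorum(TOTAL_VOTES, 2)

# Witness lost and one node down: 1 of 3 votes online.
witness_and_node_lost = has_quorum(TOTAL_VOTES, 1)

print("witness lost only:", witness_lost)
print("witness and one node lost:", witness_and_node_lost)
```

Under these assumptions, losing only the witness still leaves a majority, so forcing quorum should only be needed when a node's vote is missing as well.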
