Hey everyone, I'm having a bit of a crisis with my Dell R650 server. All my virtual machines recently started running very slowly. I'm running the latest vSphere 8, and with no changes made, CPU usage on the guest VMs consistently hovers around 95%, even when they are idle. This isn't a one-off: it's happening on several hosts. I've made sure all firmware is up to date, wiped and reinstalled ESXi and Hyper-V, and the VMs are on local SAS storage. Dell diagnostics show no problems, and everything looks normal in esxtop. I even tried a different drive for the OS, but the issue persists. I'm running out of ideas on what to check next. Any thoughts?
5 Answers
What do your performance logs show? And when you mention different drives, how many are you using? What types are they, and is the RAID controller configured correctly? It’s crucial to ensure you have enough physical resources for your kernel to function properly. Also, if you move these VMs to a different physical server, does the problem follow them?
Have you looked into what's happening inside the VMs that might be causing the high CPU usage? It sounds like something is bogging them down. I've seen VMs become unusably slow, where just opening File Explorer sends the CPU spiking to 95-100%. They eventually settle down, but it takes time, and it keeps happening even during idle periods.
That's exactly it! The VMs are super laggy, and even minimal tasks make the CPU spike. They just don’t behave the way they should.
Have there been any changes enabling virtualization-based technologies in the VMs recently? Sometimes that can also impact performance, although if it works fine in other clusters, it might not be the root cause here.
No changes, really. Everything was running fine in the other cluster. It's just weird that one day it all slowed down.
I’ve run into a similar issue before, where a server looked healthy but was extremely slow. In my case, the power supplies were at 100% power draw for no clear reason, and it turned out to be a long-standing issue. It’s worth checking whether your metrics show sustained draw like that over time as well.
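If you can export power readings over time, a quick script can flag sustained periods near the PSU limit. This is only a minimal sketch: the sample list, the 800 W capacity, the 95% threshold, and the function name are all hypothetical, and you would substitute whatever your own monitoring exports.

```python
def sustained_high_draw(samples_w, capacity_w, threshold=0.95, min_run=3):
    """Return (start, end) index pairs where power draw stays at or above
    threshold * capacity_w for at least min_run consecutive samples."""
    limit = threshold * capacity_w
    runs = []
    start = None  # index where the current high-draw run began, if any
    for i, watts in enumerate(samples_w):
        if watts >= limit:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close out a run that continues to the end of the data.
    if start is not None and len(samples_w) - start >= min_run:
        runs.append((start, len(samples_w) - 1))
    return runs


# Hypothetical readings in watts against an assumed 800 W supply.
readings = [400, 790, 800, 795, 500, 800, 800]
print(sustained_high_draw(readings, capacity_w=800))
```

Short spikes are ignored on purpose; only runs of at least `min_run` samples are reported, since a brief peak is normal while a supply pinned near its limit is not.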
Interesting! I haven't checked power metrics recently, but I will certainly look into that. Everything seems fine on the hardware end, though.
How many VMs are set up to use as many vCPUs as you have physical cores? It’s usually a good idea to add an extra vCPU if you hit around 80% CPU usage, especially in heavy workloads like this.
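To answer that quickly, you can compare the vCPU counts against the host's physical cores. A rough sketch, assuming you have the per-VM vCPU counts and the core count to hand; the VM names, numbers, and function names below are made up for illustration.

```python
def vcpu_to_core_ratio(vcpus_by_vm, physical_cores):
    """Total allocated vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_by_vm.values()) / physical_cores


def vms_matching_host_cores(vcpus_by_vm, physical_cores):
    """Names of VMs given at least as many vCPUs as the host has cores."""
    return sorted(
        name for name, vcpus in vcpus_by_vm.items() if vcpus >= physical_cores
    )


# Hypothetical inventory: VM name -> configured vCPUs, on an assumed 32-core host.
inventory = {"app01": 32, "db01": 8, "web01": 4}
print(vcpu_to_core_ratio(inventory, 32))
print(vms_matching_host_cores(inventory, 32))
```

A VM sized to the full core count leaves the hypervisor nothing to schedule around, so those are the ones worth looking at first.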
Right now, I've only got one VM using the resources. Nothing’s been changed or rebooted in months, then suddenly it all went downhill.

The logs looked pretty normal overall. This is a vSAN host with a bunch of drives, and the same issue occurs with vSAN, iSCSI, and even local storage. I tried rebuilding without vSAN, with no luck there either.