I've got a custom rackmount server running Ubuntu that has started freezing unpredictably. Sometimes it runs fine for almost a week; other times it locks up in under 24 hours. When it freezes, the monitor still shows the login splash screen and the server still answers pings, but I can't SSH into it at all. I hadn't made any hardware or software changes before the issues started. The server is dedicated to running Digital Watchdog camera software and was built about six months ago. Here are the key specs:
- AMD Ryzen 9 9900X
- MSI X870E Carbon Wi-Fi motherboard
- 32GB G.Skill Flare X5 DDR5 RAM
- 2x Samsung 990 Pro 2TB NVMe SSDs
- Broadcom 9500-8i HBA card with 8x 14TB hard drives in RAID-6
- Intel X550-T2 10Gb network adapter
I've tried a bunch of troubleshooting steps: running memtest, checking drives, reinstalling Ubuntu, and monitoring CPU temps. I still need to remove the HBA and network cards for testing. I've looked at logs but haven't found anything helpful. Any suggestions on how to further diagnose this issue?
5 Answers
What’s showing up in your logs? They might give you a clue about what’s causing the freezes.
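If the journal is persistent on your install (an assumption, it depends on how journald is configured), the last lines logged before the freeze live in the previous boot's journal rather than the current one. A minimal sketch for pulling them from a script; the function names are made up for illustration, and `journalctl` may need sudo or membership in the journal-reading group:

```python
import subprocess


def previous_boot_command(max_lines=200, priority="warning"):
    """Build a journalctl command for the tail of the previous boot's log."""
    return [
        "journalctl",
        "-b", "-1",            # -1 selects the boot before the current one
        "-p", priority,        # this priority and anything more severe
        "-n", str(max_lines),  # only the last max_lines entries
        "--no-pager",
    ]


def tail_previous_boot(max_lines=200, priority="warning"):
    """Run the command and return whatever it printed as text."""
    result = subprocess.run(
        previous_boot_command(max_lines, priority),
        capture_output=True,
        text=True,
        check=False,  # a missing previous boot should not raise
    )
    return result.stdout or result.stderr
```

Calling `print(tail_previous_boot())` right after a forced reboot shows what the kernel and services logged just before the lock-up, if anything made it to disk.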
Do you have a swap file set up? Even with 32GB of RAM, the system can freeze if it runs out of memory and has no swap to fall back on. If that turns out to be the case, you may need to increase your swap space.
Make sure to check the IO wait in your CPU usage. If you're seeing high IO wait, it may correlate with the freezing, and tools like iotop can help pinpoint what's causing it.
Great idea! I’ll add that to my monitoring script too.
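For the monitoring script, one way to get an IO wait figure without extra tools is to sample the aggregate `cpu` line in `/proc/stat` twice and compare the counters. This is only a sketch; the function names are hypothetical and the sampling interval is up to you:

```python
def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Fields are user, nice, system, idle, iowait, irq, softirq, steal, ...
    # Guest time is already counted inside user, so only sum the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_counters(path="/proc/stat"):
    """Take one sample from the running system."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.read())
```

In the script, take one `read_cpu_counters()` sample, sleep for a few seconds, take a second one, and log `iowait_percent(first, second)`. A value that climbs in the minutes before a freeze would point toward storage, at which point `iotop` can show which process is responsible.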
Your system may be running out of RAM and swapping heavily, which can make it unresponsive. Try running `top` and pressing Shift+M to sort by memory usage and see what's consuming it.
Good idea! I'm adding memory monitoring to my script to keep an eye on usage.
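A rough sketch of what the memory part of such a script could look like, reading `/proc/meminfo` and syncing each log line to disk so the last entry survives a hard lock-up. The function names and the log format are assumptions, not anything from the original script:

```python
import os


def parse_meminfo(text):
    """Map /proc/meminfo keys to their numeric values (kiB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Percent of RAM still available and percent of swap in use."""
    swap_total = info["SwapTotal"]
    swap_used = swap_total - info["SwapFree"]
    return {
        "mem_available_pct": 100.0 * info["MemAvailable"] / info["MemTotal"],
        "swap_used_pct": 100.0 * swap_used / swap_total if swap_total else 0.0,
    }


def append_synced(path, line):
    """Append one line and fsync it so it reaches disk before a possible freeze."""
    with open(path, "a") as log_file:
        log_file.write(line + "\n")
        log_file.flush()
        os.fsync(log_file.fileno())
```

Run from a timer or loop, this would read `/proc/meminfo`, pass the text through `parse_meminfo` and `memory_summary`, and write the result with `append_synced`. If available memory trends toward zero and swap fills up shortly before each freeze, memory pressure is the likely culprit.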
Have you checked whether your boot drive is full? I've seen systems act weird and lock up when log files balloon and fill the drive.
Unfortunately, it's not full. But thanks for the suggestion!

I’ve got a 2GB swap file, but I plan to monitor RAM usage more closely.