I'm trying to help a friend with their PC build that's been experiencing random crashes and errors that seem unrelated. The symptoms suggest a faulty GPU, but there were initial motherboard fault codes that pointed to potential RAM issues. Before going through the RMA process with MSI, I wanted to see if anyone has suggestions on what else to look into.
Here are the symptoms we're facing:
- The system crashes at unpredictable intervals with no error screen, starting about a week after the initial build. Sometimes it runs for hours, while other times it crashes right after booting up. The logs show bug check 0x00000116 (VIDEO_TDR_FAILURE).
- We've noticed graphical glitches and display dropouts on the monitor driven by the GPU, along with occasional power loss to USB devices.
- The motherboard's DRAM error LED lights up during boot, and it still does so with known-good RAM installed.
- At times, the system throws failure codes related to both the CPU and GPU.
Attempts to resolve the issues include:
- Reinstalling the OS and updating all drivers and BIOS.
- Replacing the motherboard twice, because the errors persisted even with known-good RAM.
- Replacing the CPU due to motherboard failures.
- Replacing the PSU; the new one has a native 12V-2x6 connector for the GPU.
- Extensive reseating of RAM and GPU.
The specific components are:
CPU: AMD Ryzen 7 9800X3D
GPU: MSI GeForce RTX 5080 INSPIRE 3X OC
RAM: 2x Kingston 32 GB KF564C32-32
Motherboard: ASUS TUF X870E-PLUS WIFI7 (Current)
PSU: Corsair RM1000e
2 Answers
When experiencing such random issues, it’s worth checking if something is overheating, which could cause cascading failures. If possible, use a thermal camera during startup to spot overheating components. A thorough cleaning with isopropyl alcohol and a soft toothbrush can help if there's any dust buildup. If you're comfortable, you might also want to replace the thermal paste on the CPU or GPU since old paste can lead to overheating.
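If a thermal camera isn't available, a rough substitute is to log the GPU's own sensor readings to disk until the next crash and then look at the last few lines. Below is a minimal sketch, assuming Python is installed and `nvidia-smi` (which ships with the NVIDIA driver) is on the PATH. The function names, the log file name, and the one-second interval are my own choices for illustration, not anything from the original build.

```python
import csv
import os
import subprocess
import time

# Fields requested from nvidia-smi; all four are standard --query-gpu fields.
QUERY = "timestamp,temperature.gpu,power.draw,clocks.sm"


def to_float(text):
    """Return a float, or None when the sensor reports something like [N/A]."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_row(line):
    """Split one 'csv,noheader,nounits' line into (timestamp, temp_c, power_w, sm_mhz)."""
    ts, temp, power, clock = [field.strip() for field in next(csv.reader([line]))]
    return ts, to_float(temp), to_float(power), to_float(clock)


def log_forever(path, interval_s=1.0):
    """Append one reading per GPU per interval; runs until interrupted or the PC crashes."""
    cmd = ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"]
    with open(path, "a", encoding="utf-8") as log:
        while True:
            out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
            for line in out.splitlines():
                if line.strip():
                    log.write(line.strip() + "\n")
            # Force the data to disk so the last reading survives a hard crash.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_s)


# Hypothetical usage: start logging, wait for a crash, then inspect the file.
# log_forever("gpu_log.csv")
```

After a crash, passing the final lines of the file through `parse_row` shows whether temperature or power draw was climbing just before the system went down. A flat, cool reading right up to the crash would point away from overheating.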
Have you tried running a memory test with MemTest86 or Memtest86+? It could help rule out RAM issues. Also, if you're overclocking any components, consider reverting those settings to stock.
I ran memtest and other diagnostics, and everything indicates the RAM is fine. The crashes happen with overclocking turned off as well.

Thanks for the tip! However, everything is new, and I've replaced most parts at least twice. I’ve checked for overheating, and the problem can occur right after the PC has been at room temperature for hours.