I'm running a small AI content creation setup, mainly focusing on image and video models. Last year, I opted for a Xeon Gold 6348 with 28 cores, as I found Xeon Silver to be slightly underwhelming for multi-GPU inference tasks, and Xeon Platinum was just too pricey and power-hungry for my needs. The Gold handles four RTX 4090 GPUs running continuously without any thermal throttling, given that I have decent airflow and a well-ventilated 2U chassis. It draws between 1.2 and 1.4 kW at maximum load during generation, and I keep the rack temperature stable using perforated doors and additional intake fans. I believe Xeon Gold strikes a great balance with enough PCIe lanes for GPUs, strong multi-thread performance, and lower energy consumption compared to Platinum. Silver can be fine for lighter workloads but tends to become a bottleneck with multiple jobs running at once. I'm curious about what Xeon tier you guys are using for AI workloads and how hot your racks get during operations.
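For anyone sizing cooling for a similar box, here is a rough sketch of the heat-load arithmetic behind those numbers. It assumes essentially all electrical power ends up as heat in the rack and uses the standard conversion of about 3.412 BTU/hr per watt. The function names are just illustrative, and the 1.2 to 1.4 kW range is the figure from my setup above, not a general rule.

```python
# Rough heat-load estimate for a GPU server.
# Assumption: all electrical power drawn is dissipated as heat in the rack.

WATTS_TO_BTU_PER_HR = 3.412  # 1 W is approximately 3.412 BTU/hr


def heat_load_btu_per_hr(power_watts: float) -> float:
    """Convert a steady power draw in watts to a heat load in BTU/hr."""
    return power_watts * WATTS_TO_BTU_PER_HR


def daily_energy_kwh(power_watts: float, hours: float = 24.0) -> float:
    """Energy used over `hours` at a constant draw, in kWh."""
    return power_watts / 1000.0 * hours


# The 1.2 to 1.4 kW range quoted in the post above.
for watts in (1200, 1400):
    print(
        f"{watts} W -> {heat_load_btu_per_hr(watts):.0f} BTU/hr, "
        f"{daily_energy_kwh(watts):.1f} kWh/day"
    )
```

Real draw fluctuates with the workload, so treat the output as an upper-bound estimate for continuous generation rather than a measured value.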
3 Answers
Have you thought about using EPYC instead of Xeon? They tend to have more PCIe lanes and offer better clock speeds at similar prices. It might be worth a look, especially for multi-GPU setups.
Yeah, with EPYC, you get better performance per core and overall power efficiency. Definitely consider it!
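To make the PCIe lane point concrete, here is a quick lane-budget sketch. The lane counts are assumptions to check against the spec sheets for your exact parts: 64 lanes per socket for a 3rd-gen Xeon Scalable CPU like the 6348, 128 for a single-socket EPYC, and x16 per GPU. The `lanes_left` helper is hypothetical, just for illustration.

```python
# Back-of-envelope PCIe lane budget for a multi-GPU box.
# Assumed lane counts (verify against vendor spec sheets):
#   3rd-gen Xeon Scalable: 64 lanes per socket
#   single-socket EPYC:    128 lanes


def lanes_left(cpu_lanes: int, gpus: int, lanes_per_gpu: int = 16,
               other_lanes: int = 0) -> int:
    """Lanes remaining after GPUs and other devices (NVMe, NICs) are attached.

    A negative result means the devices cannot all run at full width
    without a PCIe switch or reduced link widths.
    """
    return cpu_lanes - gpus * lanes_per_gpu - other_lanes


for name, lanes in (("Xeon, 1 socket (assumed 64)", 64),
                    ("EPYC, 1 socket (assumed 128)", 128)):
    # Four x16 GPUs plus one x4 NVMe drive as an example load.
    print(name, "->", lanes_left(lanes, gpus=4, other_lanes=4), "lanes left")
```

A dual-socket Xeon board doubles the lane count, so this only compares single-socket configurations.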
Does my Xeon E5310 from 2007 count? It's old but still chugging along!
Sure, it technically counts as a Xeon, but keep in mind it's pretty outdated compared to what’s currently available!
Just a heads up, 6th-gen Xeons (Xeon 6) have dropped the metal tier names (Bronze, Silver, Gold, Platinum). In fact, what's currently used in the B300 from Nvidia is linked here if you want to check it out.

Totally agree! EPYC is looking like the go-to for multi-GPU inference this year, especially since it has great core scalability and efficiency.