Qwen3.5-397B-A17B reaches 20 t/s TG and 700 t/s PP with a 5090
r/LocalLLaMA
I could not find good data points on what speeds one can get with a single 5090 and enough DDR4 RAM. My system: AMD EPYC 7532 32-core CPU, ASRock ROMED8-2T motherboard, 256 GB of 3200 MHz DDR4, one 5090, and a 2 TB NVMe SSD. Note that I bought this system before the RAM crisis. The 5090 is connected at PCIe 4.0 x16. So, here are some speed metrics for Qwen3.5-397B-A17B Q4_K_M from bartowski/Qwen_Qwen3.5-397B-A17B