Final Monster: 32x AMD MI50 32GB at 9.7 t/s (TG) & 264 t/s (PP) with Kimi K2.6

Setup: 32x MI50 32GB running moonshotai/Kimi-K2.6 (int4) on vllm-gfx906-mobydick.

Throughput: 9.7 tok/s generation (136-token output) and 263 tok/s prompt processing (14564-token input).

GitHub link of the vLLM fork:

Power draw: ~640W (idle) / ~4800W (peak inference)

Is it worth it? No, unless you've got solar panels or free energy…

Setup details: It's just 2 nodes of 16 GPUs each that I connected together with a 10G Ethernet cable.
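
For a rough sense of what those numbers mean in practice, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above. The function names are made up for illustration, and it assumes that prefill and generation run one after the other and that the rig draws the full ~4800W peak during the whole generation phase, so the energy figure is an upper bound rather than a measurement.

```python
# Figures quoted in the post.
NUM_GPUS = 32
VRAM_PER_GPU_GB = 32
PROMPT_TOKENS = 14564      # input length of the benchmark request
OUTPUT_TOKENS = 136        # output length of the benchmark request
PP_TOK_PER_S = 263.0       # prompt processing speed
TG_TOK_PER_S = 9.7         # token generation speed
PEAK_POWER_W = 4800.0      # peak draw during inference

JOULES_PER_KWH = 3.6e6


def total_vram_gb(num_gpus, vram_per_gpu_gb):
    """Aggregate VRAM across all cards."""
    return num_gpus * vram_per_gpu_gb


def request_latency_s(prompt_tokens, output_tokens, pp_tok_per_s, tg_tok_per_s):
    """Prefill time plus generation time, assuming they run sequentially."""
    return prompt_tokens / pp_tok_per_s + output_tokens / tg_tok_per_s


def energy_per_output_token_j(power_w, tg_tok_per_s):
    """Joules per generated token if power_w is drawn for the whole decode."""
    return power_w / tg_tok_per_s


def kwh_per_million_output_tokens(power_w, tg_tok_per_s):
    """Same estimate, scaled to one million generated tokens and converted to kWh."""
    return energy_per_output_token_j(power_w, tg_tok_per_s) * 1e6 / JOULES_PER_KWH


if __name__ == "__main__":
    latency = request_latency_s(PROMPT_TOKENS, OUTPUT_TOKENS, PP_TOK_PER_S, TG_TOK_PER_S)
    print(f"Total VRAM: {total_vram_gb(NUM_GPUS, VRAM_PER_GPU_GB)} GB")
    print(f"Benchmark request latency: {latency:.1f} s")
    print(f"Energy per output token (upper bound): "
          f"{energy_per_output_token_j(PEAK_POWER_W, TG_TOK_PER_S):.0f} J")
    print(f"Energy per 1M output tokens (upper bound): "
          f"{kwh_per_million_output_tokens(PEAK_POWER_W, TG_TOK_PER_S):.0f} kWh")
```

Under those assumptions the benchmark request takes roughly 69 seconds end to end, most of it prefill, and generation costs at most about 137 kWh per million output tokens. That is the arithmetic behind the "solar panels or free energy" comment.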