M5 Max 128GB with three 120B models
r/LocalLLaMA
Nemotron-3 Super: Q4_K_M
GPT-OSS 120B: MXFP4
Qwen3.5 122B: Q4_K_M

Overall: Nemotron-3 Super > GPT-OSS 120B > Qwen3.5 122B

Quality-wise, Nemotron-3 Super is slightly better than GPT-OSS 120B, but GPT-OSS 120B is twice as fast.

Speed-wise, GPT-OSS 120B is about twice as fast as the other two: 77 t/s vs. roughly 35 t/s.

submitted by /u/albertgao
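For anyone who wants to see what the speed gap means in practice, here is a rough sketch of the arithmetic using the two throughput figures from the post. The 1000-token response length is a made-up illustration value, not something that was measured, and the calculation covers generation only (prompt processing time is ignored).

```python
# Generation speeds quoted in the post, in tokens per second.
# The 35 t/s figure is approximate ("35t/s ish").
GPT_OSS_TPS = 77.0
OTHERS_TPS = 35.0

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1000


def speedup(fast_tps: float, slow_tps: float) -> float:
    """Return how many times faster the first throughput is than the second."""
    return fast_tps / slow_tps


def seconds_for(tokens: int, tps: float) -> float:
    """Return the time in seconds to generate `tokens` at `tps` tokens per second."""
    return tokens / tps


if __name__ == "__main__":
    print(f"speedup: {speedup(GPT_OSS_TPS, OTHERS_TPS):.1f}x")
    print(f"GPT-OSS 120B: {seconds_for(RESPONSE_TOKENS, GPT_OSS_TPS):.1f} s")
    print(f"other two:    {seconds_for(RESPONSE_TOKENS, OTHERS_TPS):.1f} s")
```

With these numbers the ratio comes out to about 2.2x, so "twice as fast" is a fair rounding: a 1000-token answer would take roughly 13 seconds instead of roughly 29.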