MiniMax-M2.7 vs Qwen3.5-122B-A10B for 96GB VRAM full offload?!
r/LocalLLaMA
Tl;dr: For 96GB VRAM full offload rigs, I'd probably choose Qwen3.5-122B-A10B over MiniMax-M2.7 today. Curious what y'all's experience is.

Quants Tested

ubergarm/MiniMax-M2.7-GGUF IQ2_KS 69.800 GiB (2.622 BPW)
ubergarm/Qwen3.5-122B-A10B-GGUF IQ5_KS 77.341 GiB (5.441 BPW)

Rambling Details

It's amazing that we now have multiple open weights LLMs that work pretty well for local vibecoding! Both quants were tested and work well enough with opencode configured to enable/disable thinking dynamically (really speeds up generating a 5 word thread title lol).
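For anyone wanting to sanity check the quant sizes and BPW figures listed above, here is a rough back-of-envelope sketch in Python. It derives the implied parameter count from file size and bits per weight, and estimates what is left over for KV cache and compute buffers. The helper names are made up for illustration, and it assumes the "96GB" budget means 96 GiB of usable VRAM, which a real rig won't quite deliver.

```python
GIB = 2**30  # bytes per GiB


def implied_params_billions(size_gib: float, bpw: float) -> float:
    """Parameter count (in billions) implied by file size and bits per weight."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw / 1e9


def headroom_gib(size_gib: float, vram_gib: float = 96.0) -> float:
    """VRAM left after loading the weights (assumes vram_gib is fully usable)."""
    return vram_gib - size_gib


# (file size in GiB, bits per weight) as quoted in the post
quants = {
    "MiniMax-M2.7 IQ2_KS": (69.800, 2.622),
    "Qwen3.5-122B-A10B IQ5_KS": (77.341, 5.441),
}

for name, (size_gib, bpw) in quants.items():
    params = implied_params_billions(size_gib, bpw)
    spare = headroom_gib(size_gib)
    print(f"{name}: ~{params:.0f}B params, ~{spare:.1f} GiB left for KV cache etc.")
```

By this math the MiniMax quant works out to roughly 229B parameters and the Qwen quant to roughly 122B, so the smaller model gets more than twice the bits per weight while still leaving room for context.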