Why MoE below A10b feels like I'm gambling

r/LocalLLaMA

We've seen lots of MoEs coming out recently. While these do phenomenal work at speed, you pay the price in coherence unless the MoE has at least 10b active parameters per token. I often code with these models and have been trying many different ones. The most recent I've tried are qwen3-coder-next, qwen3.5-35b, and qwen3.6-35b, and none of them, not even qwen3.6-35b-A3b, come close to the level of stability I witnessed in qwen3.5-27b. While the A3b MoE can solve the problem, it often needs hand-holding and multi-turn steering.