Qwen3.6 35B + the right coding scaffold got my local setup to 9/10 on real Go tasks

I wanted to test a slightly different question than "can one open model beat GPT-5.4 Codex?" The question was: can a combination of local models, scaffolding, repair loops, and routing policies, all running on home hardware, get close enough to frontier coding models on my actual workload?

Short version: yes, surprisingly. On my first curated 10-task Go eval set, a routed local setup got 9 of the 10 tasks passing their tests.