Been building a test-time compute pipeline around Qwen3-14B for a few months. Finally got results worth sharing.

r/LocalLLaMA

I'm a broke college student who got really tired of spending hundreds on Claude every month just to code on side projects. At the same time, I was looking at how insane the compute costs were for a local model that was barely capable of coding. So I thought: what if I could get a small local model to perform closer to frontier models? I didn't think it was possible, but I tried anyway.