Been building a test-time compute pipeline around Qwen3-14B for a few months. Finally got results worth sharing.
r/LocalLLaMA
I'm a broke college student who got super tired of spending hundreds on Claude every month just to code on side projects. At the same time, I was looking at how insane the compute costs were to get a model that was barely capable of coding. So I thought: what if I could get a small local model to perform closer to frontier models? I didn't think it was possible, but I tried anyway.