LeanLoop, the Tool Claude Leans On

So I bought a second graphics card the other week to get in on the local AI craze, and I've been having the hardest time using it to build my website. It's been unreliable: the context gets eaten up, and it kind of hallucinates sometimes. I had to double-check everything, which has been very tricky. I use the cloud models too; they're expensive, but they're top quality. So the question becomes: how do I get the best of both worlds? This is my answer: subsidizing cloud API costs with my local LLM, a qwen3.6 35B A3B running at 32k context.