Where we are. In a year, everything has changed. Kimi - Minimax - Qwen - Gemma - GLM

I know benchmarks are questionable, imprecise for individual use cases, and LLMs are often trained to excel at them. But we're not talking numbers here. We're talking about a trend. When I was using GPT-4o or Sonnet 3.7, if you'd told me I could do all those things locally in such a short time, I wouldn't have believed it. Now it's happening, and not just for those with 400GB of VRAM. It's also happening on affordable hardware. I think if Qwen 3.6 27b actually comes out soon, it will be truly incredible.