Gemma 4 26B A3B is mind-blowingly good, if configured right
r/LocalLLaMA
Open Source AI
For the last few days I've been trying different models and quants on my RTX 3090 in LM Studio, but every single one glitched on tool calling and got stuck in an infinite loop that never stops. Still, I really liked the model because it is really fast, around 80-110 tokens per second, and even at high context it maintains very high speeds.