Gemma 4 26B A3B is mind-blowingly good, if configured right

r/LocalLLaMA
Open Source AI

For the last few days I've been trying different models and quants on my RTX 3090 in LM Studio, but every single one glitches on tool calling and gets stuck in an infinite loop that doesn't stop. But I really liked the model because it is really fast, around 80-110 tokens a second, and even at high context it still maintains very high speeds.