Gemma 4 models feel very different depending on size (26B vs 31B)

r/LocalLLaMA
Open Source AI

I spent a few hours trying out the new Gemma 4 models, and one thing stood out pretty quickly: the difference between sizes is more noticeable than I expected. I didn't run any formal benchmarks, just hands-on usage.

Tested:
- Gemma-4-26B-A4B-it
- Gemma-4-31B-it

Mostly used them for:
- some coding (Python + small scripts)
- general prompts
- some longer / slightly more complex instructions

🧠 31B (Gemma-4-31B-it)

This one feels a lot more stable once prompts get even a little complex.