Gemma 4 small model comparison


I know Artificial Analysis isn't everyone's favorite benchmarking site, but it's a data point. I was particularly interested in how Gemma 4 E4B performs against comparable models on hallucination rate and on the ratio of intelligence score to output tokens. Hallucination rate matters especially for small models, because they often have to rely on external sources (RAG, web search, etc.) for hard knowledge.
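
For anyone who wants to redo the intelligence/output tokens comparison with their own numbers, here is a rough sketch of one way to compute it. I'm assuming the ratio means "benchmark score per million output tokens", which may not be exactly how the site defines it. The model names and values below are made-up placeholders, not real benchmark results.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    intelligence_score: float  # benchmark index score (placeholder values below)
    output_tokens: int         # output tokens spent producing that score


def score_per_million_tokens(result: ModelResult) -> float:
    """Assumed definition: index score divided by millions of output tokens."""
    if result.output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    return result.intelligence_score / (result.output_tokens / 1_000_000)


# Hypothetical entries for illustration only; replace with numbers you look up.
results = [
    ModelResult("model-a", 20.0, 10_000_000),
    ModelResult("model-b", 25.0, 50_000_000),
]

# Higher is better: more score for each token the model generates.
ranked = sorted(results, key=score_per_million_tokens, reverse=True)
for r in ranked:
    print(f"{r.name}: {score_per_million_tokens(r):.2f} points per million output tokens")
```

With these placeholder numbers, model-b has the higher raw score but model-a ranks first on the ratio, which is the kind of trade-off the ratio is meant to surface.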