Gemma 4 31b shapes up to be one of the most cost-effective models ever

r/LocalLLaMA

Artificial Analysis came out with some first tests. There are still some questions about how comparable the cost-to-run metric is across models, but so far it indicates that Gemma 4 31b is much more token-efficient than Qwen and that providers charge far less for its tokens, albeit while benchmarking slightly worse overall. Even for running a model over an API on tasks that aren't hyper-complex, this seems extremely promising, especially if you don't need a blazing-fast answer.

submitted by /u/tobias_681