Mistral Nemo upscale, but kinda weird

r/LocalLLaMA

March 2026. I wanted to upscale, I wanted to prune. So why not have both? And why's the fish fat anyway? And is this even coherent at this point?

It's coherent, follows instructions, and knows new stuff and new languages.

The model is available here:

It started as a normal Mistral Nemo, then it ate about 3B tokens, and absolutely unhinged modifications were made to it, making it thiccer in all the right(?) places. Basically, this is a highly experimental proper upscale of mistralai/Mistral-Nemo-Base-2407.