The joy and pain of training an LLM from scratch
r/LocalLLaMA
Generative AI
Mii-llm just released a detailed technical report on the development of the Zagreus and Nesso model families: a set of 0.4B-parameter language models trained from scratch with a focus on edge deployment, multilingual capability, and European languages. The report documents the full pipeline behind a family of small language models designed for Italian, Spanish, French, and Portuguese, with bilingual pre