LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

arXiv:2602.12005v3 Announce Type: replace Language models have consistently grown to compress world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter count. The capacity of Small Language Models (SLMs) is especially limited, which leads to factually incorrect generations. This problem is often mitigated by giving the SLM access to an outside