AI RESEARCH

InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition

arXiv cs.CL

arXiv:2605.02364v1 Announce Type: new Upweighting high-quality data in LLM pre