AI RESEARCH

Data-efficient pre-training by scaling synthetic megadocs

arXiv cs.LG

arXiv:2603.18534v1 Announce Type: new Synthetic data augmentation has emerged as a promising solution when pre-