AI RESEARCH
Data-efficient pre-training by scaling synthetic megadocs
arXiv CS.LG
arXiv:2603.18534v1 Announce Type: new Synthetic data augmentation has emerged as a promising solution when pre-