AI RESEARCH
Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG
arXiv cs.LG
arXiv:2603.23562v1 Announce Type: new Synthetic data augmentation helps language models [...] on synthetic tokens or using stronger generators yields diminishing returns below the performance of RAG. To break the RAG ceiling, we