AI RESEARCH
Beyond Sunk Costs: Boosting LLM Pre-training Efficiency via Orthogonal Growth of Mixture-of-Experts
arXiv cs.LG
arXiv:2510.08008v2 Announce Type: replace As the computational demands for pre-