AI RESEARCH

Beyond Sunk Costs: Boosting LLM Pre-training Efficiency via Orthogonal Growth of Mixture-of-Experts

arXiv cs.LG

arXiv:2510.08008v2 Announce Type: replace As the computational demands for pre-