AI RESEARCH

Train Separately, Merge Together: Modular Post-Training with Mixture-of-Experts

arXiv CS.LG

arXiv:2604.18473v1 Announce Type: new Extending a fully post-trained language model with new domain capabilities is fundamentally limited by monolithic