AI RESEARCH
Enhancing Mixture-of-Experts Specialization via Cluster-Aware Upcycling
arXiv CS.CV
arXiv:2604.13508v1 Announce Type: new Sparse Upcycling provides an efficient way to initialize a Mixture-of-Experts (MoE) model from pretrained dense weights instead of