Boundary Mass and the Soft-to-Hard Limit in Mixture-of-Experts

arXiv cs.LG

arXiv:2605.02124v1

Softmax-routed mixture-of-experts (MoE) models approach hard routing as the temperature tends to zero, but this limit is singular near routing ties. This paper studies that singularity at the population level for squared-loss MoE regression. The central object is the \emph{boundary mass}, namely the probability that the top two router scores are separated by only a small margin.
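To make the two notions in the abstract concrete, the following is a minimal sketch and not the paper's method. It estimates an empirical boundary mass (the fraction of samples whose top-two score margin is at most a threshold) and compares softmax routing weights with hard argmax routing as the temperature shrinks. The function names, the Gaussian score distribution, the choice of four experts, and the thresholds are all assumptions made for illustration.

```python
import numpy as np

def softmax_routing(scores, temperature):
    """Softmax gate weights at a given temperature (each row sums to 1)."""
    z = scores / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=-1, keepdims=True)

def top2_margin(scores):
    """Gap between the largest and second-largest score in each row."""
    top2 = np.partition(scores, -2, axis=-1)[..., -2:]  # [second largest, largest]
    return top2[..., 1] - top2[..., 0]

def boundary_mass(scores, eps):
    """Empirical fraction of rows whose top-two margin is at most eps."""
    return float(np.mean(top2_margin(scores) <= eps))

# Hypothetical router scores: 100,000 inputs, 4 experts, i.i.d. standard normal.
rng = np.random.default_rng(0)
scores = rng.normal(size=(100_000, 4))

# Boundary mass for a few illustrative margins.
for eps in (0.01, 0.05, 0.2):
    print(f"eps={eps}: boundary mass ~ {boundary_mass(scores, eps):.4f}")

# Mean worst-case gap between soft and hard routing weights per input.
hard = np.eye(scores.shape[1])[scores.argmax(axis=-1)]
for t in (1.0, 0.1, 0.01):
    soft = softmax_routing(scores, t)
    gap = np.abs(soft - hard).max(axis=-1)
    print(f"temperature={t}: mean |soft - hard| = {gap.mean():.4f}")
```

In this sketch, an input with a clear winner gets nearly one-hot softmax weights at low temperature, while an exact tie keeps weight 0.5 on each tied expert at every temperature. That is the kind of near-tie behavior the boundary mass is meant to quantify.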