Expert Selections In MoE Models Reveal (Almost) As Much As Text

arXiv:2602.04105v2 Announce Type: replace We present a text-reconstruction attack on mixture-of-experts (MoE) language models that recovers tokens from expert selections alone. In MoE models, each token is routed to a subset of expert subnetworks; we show that these routing decisions leak substantially more information than previously understood.
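
To make the routing signal concrete, the following is a minimal sketch of per-token expert selection. It assumes a single linear router followed by top-k selection, a common MoE design that is not confirmed here to be the one the attack targets. The function names, the toy sizes (8 experts, 4-dimensional token vectors, top-2 routing), and the random weights are all hypothetical and chosen only for illustration.

```python
import random


def top_k_experts(router_logits, k):
    """Return the indices of the k largest router logits, in ascending index order."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i],
                    reverse=True)
    return sorted(ranked[:k])


def route_tokens(token_vectors, router_weights, k):
    """Compute each token's expert selection from a linear router (assumed design)."""
    selections = []
    for vec in token_vectors:
        # One logit per expert: dot product of that expert's router row with the token.
        logits = [sum(w * x for w, x in zip(row, vec)) for row in router_weights]
        selections.append(top_k_experts(logits, k))
    return selections


# Hypothetical toy setup, not taken from the paper.
rng = random.Random(0)
NUM_EXPERTS, DIM, K = 8, 4, 2
router_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
token_vectors = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]

# One small set of expert indices per token: this is the kind of signal
# an observer of routing decisions would see.
selections = route_tokens(token_vectors, router_weights, K)
```

In this sketch the selection is a deterministic function of the token's representation, so identical inputs always produce identical expert sets. That dependence on the input is what makes routing decisions a potential side channel.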