AI RESEARCH
Mixture of Layers with Hybrid Attention
arXiv CS.AI
arXiv:2605.09516v1 Announce Type: cross
Standard Mixture-of-Experts (MoE) transformers route tokens to expert subnetworks within each layer, but the layer structure itself remains monolithic. We