AI RESEARCH
Shaken or Stirred? An Analysis of MetaFormer's Token Mixing for Medical Imaging
arXiv cs.CV
arXiv:2510.05971v3. The generalization of the Transformer architecture via MetaFormer has reshaped our understanding of its success in computer vision. By replacing self-attention with simpler token mixers, MetaFormer provides strong baselines for vision tasks. However, while it has been studied extensively on natural image datasets, its use in medical imaging remains scarce, and existing works rarely compare different token mixers, potentially overlooking suitable design choices. In this work, we present the first comprehensive study of token mixers for medical imaging.