AI RESEARCH
The Loupe: A Plug-and-Play Attention Module for Amplifying Discriminative Features in Vision Transformers
arXiv CS.LG
arXiv:2508.16663v2 Announce Type: replace-cross Fine-Grained Visual Classification (FGVC) requires models to focus on subtle, task-relevant regions rather than broad object context. We present The Loupe, a lightweight plug-and-play spatial gating module for hierarchical Vision Transformers. The module is inserted at an intermediate feature stage, predicts a single-channel spatial mask with a small CNN, and uses that mask to reweight feature activations during end-to-end
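
The gating step described in the abstract can be illustrated with a rough forward-pass sketch in NumPy. This is not the authors' implementation: the abstract does not give the depth of the small CNN, its activations, or how the mask is normalized, so the two 3x3 convolutions, the ReLU, the sigmoid, the channels-first layout, and every name below (`conv3x3`, `spatial_gate`, `init_params`) are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv3x3(x, w, b):
    """Same-padded 3x3 convolution (cross-correlation, as in most DL libraries).

    x: (C_in, H, W), w: (C_out, C_in, 3, 3), b: (C_out,) -> (C_out, H, W)
    """
    _, h, wd = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out + b[:, None, None]


def init_params(channels, hidden, seed=0):
    """Random weights for a hypothetical two-layer mask predictor."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.1, (hidden, channels, 3, 3)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (1, hidden, 3, 3)),
        "b2": np.zeros(1),
    }


def spatial_gate(features, params):
    """Predict a single-channel mask and reweight the features with it.

    features: (C, H, W) intermediate feature map (layout is an assumption).
    Returns the gated features (C, H, W) and the mask (1, H, W).
    """
    hidden = np.maximum(conv3x3(features, params["w1"], params["b1"]), 0.0)  # ReLU (assumed)
    mask = sigmoid(conv3x3(hidden, params["w2"], params["b2"]))  # sigmoid (assumed)
    # The single-channel mask broadcasts across all C feature channels.
    return features * mask, mask


# Example: a 96-channel, 14x14 feature map (sizes chosen arbitrarily).
features = np.random.default_rng(1).normal(size=(96, 14, 14))
gated, mask = spatial_gate(features, init_params(channels=96, hidden=16))
```

Because the mask has one channel, it scales every feature channel at a given spatial location by the same factor, which is what makes it a spatial gate rather than a channel gate. The sketch covers only the forward pass; in the setting the abstract describes, the mask predictor would be trained jointly with the backbone.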