AI RESEARCH
Sink vs. diagonal patterns as mechanisms for attention switch and oversmoothing prevention
arXiv CS.AI
arXiv:2605.08453v1 Announce Type: cross This paper studies the role of sinks and diagonal patterns as attention-switch and anti-oversmoothing mechanisms. We analyze the geometric conditions under which sinks can be represented, showing that an alignment between the embedding of the sink and all other embeddings is necessary. Next, we refine the current understanding of the role of sinks in oversmoothing prevention: we specify the conditions under which dense attention provably smooths more than sparse attention, and we empirically verify that these conditions are often satisfied in practice.
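
As a rough illustration of the oversmoothing claim, the toy sketch below applies three hand-built row-stochastic attention matrices (dense uniform, diagonal, and a sink pattern) to four 2-D token embeddings and compares how much token dispersion survives. Everything here is an assumption made for illustration and is not taken from the paper: the embeddings, the sink weight `s`, the `dispersion` measure (mean squared distance to the centroid), and the omission of value projections and residual connections.

```python
# Toy comparison of dense, diagonal, and sink attention patterns.
# All names and numbers are hypothetical; this is not the paper's construction.

def apply_attention(A, X):
    """Return A @ X for a row-stochastic matrix A and token embeddings X."""
    n, d = len(X), len(X[0])
    return [[sum(A[i][j] * X[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]

def dispersion(X):
    """Mean squared distance of tokens to their centroid (0 means fully smoothed)."""
    n, d = len(X), len(X[0])
    mean = [sum(row[k] for row in X) / n for k in range(d)]
    return sum((row[k] - mean[k]) ** 2 for row in X for k in range(d)) / n

# Four tokens on the unit circle; token 0 plays the role of the sink.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
n = len(X)

# Dense: every token attends uniformly to all tokens.
dense = [[1.0 / n] * n for _ in range(n)]

# Diagonal: every token attends only to itself.
diagonal = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Sink: each non-sink token puts weight s on token 0 and 1 - s on itself.
s = 0.5  # assumed sink weight
sink = [[0.0] * n for _ in range(n)]
sink[0][0] = 1.0
for i in range(1, n):
    sink[i][0] = s
    sink[i][i] = 1.0 - s

disp_input = dispersion(X)
disp_dense = dispersion(apply_attention(dense, X))
disp_diag = dispersion(apply_attention(diagonal, X))
disp_sink = dispersion(apply_attention(sink, X))

print("input   :", disp_input)
print("dense   :", disp_dense)
print("diagonal:", disp_diag)
print("sink    :", disp_sink)
```

In this toy setting the dense pattern collapses every token onto the centroid, the diagonal pattern leaves the tokens unchanged, and the sink pattern falls in between, which matches the qualitative ordering the abstract describes. It says nothing about the precise conditions the paper proves.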