AI RESEARCH
Coupled Query-Key Dynamics for Attention
arXiv CS.LG
arXiv:2604.01683v1 Announce Type: new Standard scaled dot-product attention computes scores from static, independent projections of the input. We show that evolving queries and keys jointly through shared learned dynamics before scoring, which we call coupled QK dynamics, improves language modeling perplexity and
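
To make the contrast in the abstract concrete, the following is a minimal pure-Python sketch. `attention_weights` is ordinary scaled dot-product scoring, softmax(QK^T / sqrt(d)). `coupled_step` is only one hypothetical reading of "evolving queries and keys jointly through shared dynamics": the shared matrix `w`, the step size `h`, the number of steps, and the update rule itself are assumptions made for illustration and are not taken from the paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, k):
    """Standard scaled dot-product attention weights: softmax(q k^T / sqrt(d))."""
    d = len(q[0])
    scores = matmul(q, transpose(k))
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def coupled_step(q, k, w, h=0.1):
    """Hypothetical joint update (an assumption, not the paper's rule).

    Both updates use the same matrix w, and each side's update reads the
    other side's current state, so q and k no longer evolve independently.
    """
    dq = matmul(k, w)  # change in q depends on k
    dk = matmul(q, w)  # change in k depends on q
    q_new = [[a + h * b for a, b in zip(r, dr)] for r, dr in zip(q, dq)]
    k_new = [[a + h * b for a, b in zip(r, dr)] for r, dr in zip(k, dk)]
    return q_new, k_new


# Toy inputs: two positions, head dimension 2.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
w = [[0.0, 1.0], [1.0, 0.0]]  # stand-in for a learned shared parameter

baseline = attention_weights(q, k)

# Evolve q and k together for a few steps before scoring.
q_t, k_t = q, k
for _ in range(3):
    q_t, k_t = coupled_step(q_t, k_t, w)
coupled = attention_weights(q_t, k_t)
```

In the baseline, `q` and `k` are fixed once they are projected. In the sketch's coupled variant, the scores are computed only after `q` and `k` have influenced each other through the shared parameter.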