AI RESEARCH
Variational Linear Attention: Stable Associative Memory for Long-Context Transformers
arXiv CS.LG
arXiv:2605.11196v1 Announce Type: new
Linear attention reduces the quadratic cost of softmax attention to $\mathcal{O}(T)$ in the sequence length $T$, but its memory state grows as $\mathcal{O}(T)$ in Frobenius norm, causing progressive interference between stored associations. We
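
The norm growth described above can be illustrated with a minimal sketch. It assumes the standard unnormalized linear-attention recurrence $S_t = S_{t-1} + v_t k_t^\top$, which is not spelled out in the abstract, and the function name `state_frobenius_norms` is hypothetical. For unit-norm keys and values the triangle inequality gives $\|S_t\|_F \le t$, and the bound is attained when the same pair is written repeatedly.

```python
import numpy as np


def state_frobenius_norms(keys, values):
    """Track ||S_t||_F for the additive update S_t = S_{t-1} + v_t k_t^T.

    Assumption: plain unnormalized linear attention; keys is (T, d_k),
    values is (T, d_v). This is an illustration, not the paper's method.
    """
    state = np.zeros((values.shape[1], keys.shape[1]))
    norms = []
    for k, v in zip(keys, values):
        state += np.outer(v, k)              # write one key-value association
        norms.append(np.linalg.norm(state))  # Frobenius norm for a 2-D array
    return np.array(norms)


T, d = 256, 16

# Worst case: the same unit-norm pair is written at every step.
e1 = np.zeros(d)
e1[0] = 1.0
worst = state_frobenius_norms(np.tile(e1, (T, 1)), np.tile(e1, (T, 1)))

# Random unit-norm pairs: growth is slower but still unbounded in T.
rng = np.random.default_rng(0)
K = rng.standard_normal((T, d))
V = rng.standard_normal((T, d))
K /= np.linalg.norm(K, axis=1, keepdims=True)
V /= np.linalg.norm(V, axis=1, keepdims=True)
random_case = state_frobenius_norms(K, V)

print(worst[-1], random_case[-1])
```

In the worst case the norm after step $t$ equals $t$ exactly. The random case stays below that bound, yet nothing in the additive update ever shrinks the state, so every new write is superimposed on all earlier ones.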