AI RESEARCH
SWAA: Sliding Window Attention Adaptation for Efficient and Quality Preserving Long Context Processing
arXiv CS.AI
arXiv:2512.10411v5 Announce Type: replace-cross The quadratic complexity of self-attention in Transformer-based LLMs makes long-context inference prohibitively expensive. Sliding Window Attention (SWA), the simplest sparse attention pattern, offers a linear-complexity alternative, but it suffers from catastrophic long-context performance collapse, which stems from two fundamental factors: the
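
To make the complexity contrast concrete, here is a minimal sketch in plain Python of a causal sliding-window mask and the number of query-key pairs it keeps. It is an illustration only, not the paper's method: the function names `sliding_window_mask` and `attended_pairs` are hypothetical, and the convention that the window includes the current token is an assumption.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window attention mask.

    Assumed convention (not taken from the paper): query position i may
    attend to key positions j with i - window < j <= i, so the window
    counts the current token itself.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the query-key pairs the mask allows (a proxy for attention cost)."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    window = 4
    for n in (8, 16, 32):
        full = attended_pairs(sliding_window_mask(n, n))      # full causal attention
        swa = attended_pairs(sliding_window_mask(n, window))  # sliding window
        print(f"n={n:3d}  full causal pairs={full:4d}  SWA pairs={swa:4d}")
```

With a fixed window, each query attends to at most `window` keys, so the pair count grows roughly as `seq_len * window` rather than `seq_len ** 2 / 2`. Setting `window` equal to `seq_len` recovers ordinary causal attention under this convention.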