Kwai Summary Attention Technical Report

arXiv CS.LG

arXiv:2604.24432v1 Announce Type: cross Long-context ability has become one of the most important development directions for next-generation Large Language Models, particularly in semantic understanding and reasoning, agentic code intelligence, and recommendation systems. However, standard softmax attention has quadratic time complexity with respect to sequence length. As the sequence length increases, this incurs substantial overhead in long-context settings, leading the