AI RESEARCH

Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation

arXiv CS.AI

arXiv:2603.11067v1 Announce Type: cross Large language models (LLMs) achieve remarkable performance, yet further gains often require costly