AI RESEARCH
Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation
arXiv CS.AI
arXiv:2603.11067v1 Announce Type: cross Large language models (LLMs) achieve remarkable performance, yet further gains often require costly