KVSlimmer: Theoretical Insights and Practical Optimizations for Asymmetric KV Merging

arXiv:2603.00907v2 Announce Type: replace The growing computational and memory demands of the Key-Value (KV) cache significantly limit the ability of Large Language Models (LLMs). While KV merging has emerged as a promising solution, existing methods rely on empirical observations of KV asymmetry and on gradient-based Hessian approximations; they lack a theoretical foundation, compress suboptimally, and add inference overhead.