Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference

arXiv:2508.09442v4 Announce Type: replace-cross The Key-Value (KV) cache, which stores intermediate attention computations (Key and Value pairs) to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model (LLM) inference. However, this efficiency optimization