AI RESEARCH

Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

arXiv cs.LG

arXiv:2604.17249v1 Announce Type: cross

Rowhammer attacks on GPU DRAM have enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. Under vLLM's Prefix Caching, each such block exists as a single physical copy with no integrity protection.
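To make the exposure concrete, the following is a minimal toy sketch of why a single shared copy matters. It is not vLLM's implementation: `ToyPrefixCache`, `flip_bit`, and the byte-buffer stand-in for key/value tensors are all hypothetical names introduced here for illustration. The CRC32 check is likewise an assumption added to show what "integrity protection" could look like in principle; the source does not say it proposes this.

```python
import hashlib
import zlib


class ToyPrefixCache:
    """Toy model of prefix caching: one physical block per unique token prefix.

    Hypothetical illustration only; real serving systems store GPU tensors.
    """

    def __init__(self):
        self.blocks = {}      # prefix hash -> bytearray (the single shared copy)
        self.checksums = {}   # prefix hash -> CRC32 recorded when the block is written

    def _key(self, tokens):
        # Identify a block by a hash of its token prefix.
        return hashlib.sha256(repr(tuple(tokens)).encode()).hexdigest()

    def get_or_create(self, tokens):
        key = self._key(tokens)
        if key not in self.blocks:
            # Stand-in bytes for the computed key/value data.
            data = bytearray(hashlib.sha256(key.encode()).digest())
            self.blocks[key] = data
            self.checksums[key] = zlib.crc32(data)
        # Every caller with the same prefix receives the same object.
        return self.blocks[key]

    def verify(self, tokens):
        # Assumed integrity check: recompute the CRC32 and compare.
        key = self._key(tokens)
        return zlib.crc32(self.blocks[key]) == self.checksums[key]


def flip_bit(buf, byte_index, bit_index):
    """Simulate a hardware-induced single-bit flip in a buffer."""
    buf[byte_index] ^= 1 << bit_index


cache = ToyPrefixCache()
system_prompt = [101, 7592, 2088, 102]  # arbitrary example token IDs

block_a = cache.get_or_create(system_prompt)  # request A
block_b = cache.get_or_create(system_prompt)  # request B reuses the prefix
shared = block_a is block_b

before = bytes(block_b)
flip_bit(block_a, 0, 3)                       # one flip in the shared copy
corrupted_for_b = bytes(block_b) != before    # request B sees the corruption too
detected = not cache.verify(system_prompt)    # only caught if a check exists

print(shared, corrupted_for_b, detected)
```

In this sketch, one flipped bit reaches every request that reuses the prefix, and nothing notices unless a check such as `verify` is run before the block is read.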