Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems
arXiv:2604.17249v1 Announce Type: cross
Abstract: Rowhammer on GPU DRAM has enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. In vLLM’s Prefix Caching, …