Efficient Remote KV Cache Reuse with GPU-native Video Codec
arXiv:2602.09725v3 Announce Type: replace-cross
Abstract: Remote KV cache reuse fetches the KV cache for an identical context from remote storage, which avoids recomputation and accelerates LLM inference. While it excels in high-speed networks, its performance de…