RetentiveKV: State-Space Memory for Uncertainty-Aware Multimodal KV Cache Eviction
arXiv:2605.04075v1 Announce Type: new
Abstract: Multimodal Large Language Models incur severe computational and memory costs when processing long visual contexts, because the visual KV cache expands substantially. Ex…