FlexiCache: Leveraging Temporal Stability of Attention Heads for Efficient KV Cache Management
arXiv:2511.00868v2 Announce Type: replace
Abstract: Large Language Model (LLM) serving is increasingly constrained by the growing size of the key-value (KV) cache, which scales with both context length and generation length. Prior work shows that atte…
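The scaling claim in the abstract is simple arithmetic: the KV cache stores one key and one value vector per layer, per KV head, per token, so its footprint grows linearly with the number of tokens held (prompt plus generated). The sketch below estimates that footprint; the function name and the example model configuration (a Llama-2-7B-like setup with 32 layers, 32 KV heads, head dimension 128, fp16) are illustrative assumptions, not details from the paper.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2, batch: int = 1) -> int:
    """Estimate KV cache size in bytes for one model configuration.

    The factor of 2 accounts for storing both the key and the value
    tensor at every layer. seq_len is the total number of cached
    tokens: context (prompt) length plus generation length.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * dtype_bytes * batch


if __name__ == "__main__":
    # Assumed Llama-2-7B-like config: 32 layers, 32 KV heads, head_dim 128.
    # At 4096 cached tokens in fp16 this is already 2 GiB per sequence,
    # which is why long contexts make the KV cache the serving bottleneck.
    size = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                          head_dim=128, seq_len=4096)
    print(f"{size / 2**30:.1f} GiB")  # → 2.0 GiB
```

Doubling the cached token count doubles this figure, so memory (and the bandwidth to read the cache each decoding step) grows with both context length and generation length, as the abstract notes.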