cs.LG

Training Transformers for KV Cache Compressibility

arXiv:2605.05971v1 Announce Type: new
Abstract: Long-context language modeling is increasingly constrained by the Key-Value (KV) cache, whose memory and decode-time access costs scale linearly with the prefix length. This bottleneck has motivated a ra…
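To make the linear scaling concrete, here is a back-of-envelope KV cache size calculation. The model configuration (a generic 7B-class transformer) is illustrative and not taken from the paper:

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Keys and values each store one head_dim vector per token,
    # per layer, per KV head -- hence the factor of 2 and the
    # linear dependence on seq_len (the prefix length).
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative config: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes).
per_token = kv_cache_bytes(1, 32, 32, 128)            # 524288 bytes = 0.5 MiB/token
at_128k = kv_cache_bytes(128_000, 32, 32, 128) / 2**30  # 62.5 GiB at a 128k prefix
```

At roughly 0.5 MiB per cached token for this configuration, a 128k-token prefix already consumes tens of gigabytes, which is the memory and bandwidth bottleneck the abstract refers to.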