Pei-Chun Su - Provide.ai

eOptShrinkQ: Near-Lossless KV Cache Compression Through Optimal Spectral Denoising and Quantization

Pei-Chun Su / May 6, 2026

arXiv:2605.02905v1
Abstract: We show that the key-value (KV) cache in transformer attention heads admits a natural decomposition into a low-rank shared context component and a full-rank per-token residual, well describ…

