Key-Value Means: Transformers with Expandable Block-Recurrent Compressed Memory
arXiv:2605.09877v2
Abstract: We present Key-Value Means (“KVM”), a novel block-recurrence for attention that can accommodate either fixed-size or growing state. Equipping a strong transformer baseline with fixed-size KVM a…