Variational Linear Attention: Stable Associative Memory for Long-Context Transformers
arXiv:2605.11196v1 Announce Type: new
Abstract: Linear attention reduces the quadratic cost of softmax attention to $\mathcal{O}(T)$, but its memory state grows as $\mathcal{O}(T)$ in Frobenius norm, causing progressive interference between stored ass…
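The state-growth claim in the abstract can be illustrated with a minimal sketch of linear attention as a recurrent associative memory. This is not the paper's method; the feature map, shapes, and normalization below are common illustrative choices (assumptions), used only to show that the memory state $S_t = \sum_{i \le t} \phi(k_i) v_i^\top$ is updated in $\mathcal{O}(1)$ per token while its Frobenius norm keeps growing with sequence length:

```python
import numpy as np

rng = np.random.default_rng(0)
d, dv, T = 16, 16, 512   # illustrative key/value dims and sequence length

def phi(x):
    # A positive feature map (elu + 1), a common choice in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

S = np.zeros((d, dv))    # associative memory state: accumulated key-value outer products
z = np.zeros(d)          # normalizer accumulator
norms = []
for t in range(T):
    k = rng.standard_normal(d)
    v = rng.standard_normal(dv)
    q = rng.standard_normal(d)
    fk = phi(k)
    S += np.outer(fk, v)                      # O(1)-per-token rank-1 write
    z += fk
    y = (phi(q) @ S) / (phi(q) @ z + 1e-6)    # read: output for query q
    norms.append(np.linalg.norm(S, "fro"))

# The state's Frobenius norm increases with the number of stored associations,
# so later writes progressively interfere with earlier ones.
assert norms[-1] > norms[T // 4] > norms[0]
```

Each stored association adds a rank-1 term to `S`, so nothing is ever forgotten or rescaled; the growing norm is the source of the interference the abstract refers to.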