Higher-order Linear Attention
arXiv:2510.27258v2 Announce Type: replace-cross
Abstract: The quadratic cost of scaled dot-product attention is a central obstacle to scaling autoregressive language models to long contexts. Linear-time attention and State Space Models (SSMs) provide …
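The contrast the abstract draws can be made concrete with a minimal sketch of standard first-order causal linear attention (the general technique the abstract alludes to, not this paper's higher-order method): by pushing queries, keys, and values through a positive feature map `phi` and maintaining a running state, the per-step cost becomes constant in sequence length, avoiding the O(n^2) attention matrix. The function name and the choice of `phi` below are illustrative assumptions.

```python
import numpy as np

def causal_linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Illustrative first-order causal linear attention (kernel trick),
    # NOT the paper's higher-order variant. A running (d x d_v) state S
    # and normalizer z replace the O(n^2) attention matrix, so the whole
    # pass costs O(n * d * d_v) time with O(d * d_v) state per step.
    n, d = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d, d_v))   # running sum of outer(phi(k_t), v_t)
    z = np.zeros(d)          # running sum of phi(k_t) for normalization
    out = np.empty((n, d_v))
    for t in range(n):
        q, k, v = phi(Q[t]), phi(K[t]), V[t]
        S += np.outer(k, v)  # recurrent state update
        z += k
        out[t] = (q @ S) / (q @ z)
    return out

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 6, 4))
print(causal_linear_attention(Q, K, V).shape)  # (6, 4)
```

Because the state update is a simple running sum, the same computation can be evaluated either recurrently (constant memory at inference) or in parallel over the sequence at training time; this dual form is what makes linear attention and related SSMs attractive for the long-context autoregressive setting the abstract describes.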