Linear-Time Global Visual Modeling without Explicit Attention
arXiv:2605.01711v1 Announce Type: new
Abstract: Existing research largely attributes the global sequence modeling capability of Transformers to the explicit computation of attention weights, a process that inherently incurs quadratic computational com…
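To make the complexity claim concrete, here is a minimal NumPy sketch contrasting standard self-attention, which materializes an n x n weight matrix and is therefore quadratic in sequence length, with a simple attention-free global mixer that is linear in sequence length. The `linear_global_mix` function is a hypothetical illustration of attention-free global context, not the method proposed in this paper.

```python
import numpy as np

def explicit_attention(x):
    # Scaled dot-product self-attention: forms an (n, n) weight matrix,
    # so time and memory grow quadratically with sequence length n.
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                       # (n, n) logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-stochastic
    return weights @ x                                  # each token attends to all

def linear_global_mix(x):
    # Illustrative linear-time alternative: every token is updated with one
    # global summary (the sequence mean), giving global context in O(n)
    # without computing any pairwise attention weights. This is an
    # assumption for illustration, NOT the paper's architecture.
    return x + x.mean(axis=0, keepdims=True)

x = np.random.randn(8, 4)
print(explicit_attention(x).shape)   # (8, 4)
print(linear_global_mix(x).shape)    # (8, 4)
```

Both functions produce an output of the same shape as the input; the difference lies entirely in whether an explicit n x n interaction matrix is ever built.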