cs.CV

Linearizing Vision Transformer with Test-Time Training

arXiv:2605.02772v1 Announce Type: new
Abstract: While linear-complexity attention mechanisms offer a promising alternative to Softmax attention for overcoming the quadratic bottleneck, training such models from scratch remains prohibitively expensive…