cs.CV, cs.LG, eess.SP

Discrete Cosine Transform Based Decorrelated Attention for Vision Transformers

arXiv:2405.13901v4 Announce Type: replace
Abstract: Self-attention is central to the success of Transformer architectures; however, learning the query, key, and value projections from random initialization remains challenging and computationally expen…
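The abstract is truncated before the method details, but the title suggests using a DCT basis in place of (or to initialize) learned attention projections, exploiting the fact that DCT rows are mutually orthogonal and hence decorrelated. A minimal sketch of that idea, with all names and the initialization scheme being illustrative assumptions rather than the paper's stated method:

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis: rows are mutually orthogonal (decorrelated)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    D = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    D[0] *= 1.0 / np.sqrt(n)    # orthonormal scaling for the DC row
    D[1:] *= np.sqrt(2.0 / n)   # orthonormal scaling for the AC rows
    return D

d = 8
D = dct_matrix(d)

# Rows are decorrelated: the Gram matrix D @ D.T is the identity.
print(np.allclose(D @ D.T, np.eye(d)))  # True

# Hypothetical use: a deterministic, decorrelated initializer for the
# query/key projections, replacing a random Gaussian initialization.
W_q = D.copy()
W_k = D.copy()
```

Because the DCT basis is fixed and orthonormal, such an initialization avoids the correlated random directions a Gaussian init can produce; whether the paper applies the transform this way cannot be confirmed from the truncated abstract.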