cs.CV

DORA: Dynamic Online Reinforcement Agent for Token Merging in Vision Transformers

arXiv:2605.11683v1 Announce Type: new
Abstract: Vision Transformers (ViTs) incur significant computational overhead because the cost of self-attention grows quadratically with the token sequence length. While existing token reduction methods mitigat…