Dispatch-Aware Ragged Attention for Pruned Vision Transformers
arXiv:2604.15408v1 Announce Type: cross
Abstract: Token pruning methods for Vision Transformers (ViTs) promise quadratic reductions in attention FLOPs by dropping uninformative patches. Yet when pruned sequences are executed with state-of-the-art vari…
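The abstract is truncated, so the paper's own kernel and dispatch scheme are not reproduced here. Purely as an illustration of the setting it describes, the sketch below shows the generic pipeline behind ragged attention over pruned ViT tokens: each image keeps a different number of patches, the survivors are packed into one flat tensor with cumulative sequence lengths (the layout variable-length attention kernels typically consume), and a naive per-sequence reference attention stands in for a fused kernel. All function names, the keep-ratio, and the importance scores are hypothetical, not the paper's method.

```python
# Minimal sketch (not the paper's method): per-image token pruning, packing
# the ragged survivors into a flat varlen layout, and a naive reference
# attention over each pruned sequence.
import torch
import torch.nn.functional as F

def prune_tokens(x: torch.Tensor, scores: torch.Tensor, keep_ratio: float = 0.5):
    """Keep the top-scoring tokens of each image independently.

    x:      (B, N, D) patch embeddings
    scores: (B, N) per-token importance (stand-in for a learned score)
    Returns a list of (k, D) tensors; real pruners may keep different
    counts per image, which is what makes the batch ragged.
    """
    B, N, D = x.shape
    k = max(1, int(N * keep_ratio))
    kept = []
    for b in range(B):
        idx = scores[b].topk(k).indices.sort().values  # preserve patch order
        kept.append(x[b, idx])
    return kept

def pack_ragged(seqs):
    """Concatenate ragged sequences and build cumulative lengths, the
    bookkeeping varlen attention kernels use to locate sequence boundaries."""
    lens = torch.tensor([s.shape[0] for s in seqs])
    cu_seqlens = torch.cat([torch.zeros(1, dtype=torch.long), lens.cumsum(0)])
    return torch.cat(seqs, dim=0), cu_seqlens  # (sum n_i, D), (B+1,)

def ragged_attention(packed, cu_seqlens, num_heads: int = 4):
    """Naive reference: attend within each pruned sequence separately.
    A fused variable-length kernel would cover all sequences in one launch."""
    D = packed.shape[-1]
    out = torch.empty_like(packed)
    cu = cu_seqlens.tolist()
    for start, end in zip(cu[:-1], cu[1:]):
        n = end - start
        seq = packed[start:end].view(1, n, num_heads, D // num_heads)
        q = k = v = seq.transpose(1, 2)                  # (1, H, n, d)
        o = F.scaled_dot_product_attention(q, k, v)      # (1, H, n, d)
        out[start:end] = o.transpose(1, 2).reshape(n, D)
    return out

if __name__ == "__main__":
    x = torch.randn(2, 196, 64)      # two images, 14x14 patches, dim 64
    scores = torch.rand(2, 196)      # placeholder importance scores
    packed, cu = pack_ragged(prune_tokens(x, scores, keep_ratio=0.3))
    print(packed.shape, cu.tolist(), ragged_attention(packed, cu).shape)
```

The per-sequence loop is only a correctness reference; the efficiency question the abstract raises concerns how such ragged, packed batches are dispatched to optimized variable-length attention kernels.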