MPM: Mutual Pair Merging for Efficient Vision Transformers
arXiv:2604.05718v1
Abstract: Reducing sequence length is a common way to accelerate transformers, but prior token-reduction work often targets classification and reports proxy metrics rather than end-to-end latency. For semantic s…